Abstract

The Horton and Tokunaga branching laws provide a convenient framework for studying
self-similarity in random trees. The Horton self-similarity is a weaker property that addresses
the principal branching in a tree; it is a counterpart of the power-law size distribution for
elements of a branching system. The stronger Tokunaga self-similarity addresses so-called
side branching. The Horton and Tokunaga self-similarity have been empirically established
in numerous observed and modeled systems, and proven for two paradigmatic models: the
critical Galton-Watson branching process with finite progeny and the finite-tree
representation of a regular Brownian excursion. This study establishes the Tokunaga and Horton
self-similarity for a tree representation of a finite symmetric homogeneous Markov chain.
We also extend the concept of Horton and Tokunaga self-similarity to infinite trees and
establish self-similarity for an infinite-tree representation of a regular Brownian motion. We
conjecture that fractional Brownian motions are also Tokunaga and Horton self-similar, with
self-similarity parameters depending on the Hurst exponent.

Keywords: self-similar trees, Horton laws, Tokunaga self-similarity, Markov chains, 
level-set tree 

1. Introduction and motivation

Hierarchical branching organization is ubiquitous in nature. It is readily seen in river
basins, drainage networks, bronchial passages, botanical trees, and snowflakes, to mention
but a few. Empirical evidence reveals a surprising similarity among various
natural hierarchies: many of them are closely approximated by so-called self-similar trees
(SSTs) [15], [16]. An SST preserves its statistical
structure, in a sense to be defined, under the operation of pruning, i.e., cutting the leaves;
this is why the SSTs are sometimes referred to as fractal trees [2]. A two-parametric subclass
of Tokunaga SSTs, introduced by Tokunaga [9] in a hydrological context, plays a special role
in theory and applications, as it has been shown to emerge in an unprecedented variety of
modeled and natural phenomena. The Tokunaga SSTs with a broad range of parameters are
seen in studies of river networks [5], [8], [9], [10], [15], vein structure of botanical leaves [2],
numerical analyses of diffusion limited aggregation, two dimensional site percolation
[20], [22], and nearest-neighbor clustering in Euclidean spaces [23]. The diversity
of these processes and models hints at the existence of a universal (not problem-specific)
underlying mechanism responsible for the Tokunaga self-similarity and prompts the question:
What probability models may produce Tokunaga self-similar trees? An important answer to
this question was given by Burd et al. [5], who studied Galton-Watson branching processes
and showed that, in this class, the Tokunaga self-similarity is a characteristic property
of critical binary branching, that is, the discrete-time process that starts with a single
progenitor and whose members equiprobably either split in two or die at every step. The
critical binary Galton-Watson process is equivalent to Shreve's random river network
model, for which the Tokunaga self-similarity has been known for a long time [5], [15].
The Tokunaga self-similarity has also been rigorously established in a general hierarchical
coagulation model of Gabrielov et al. [21] introduced in the framework of self-organized
criticality, and in a random self-similar network model of Veitzer and Gupta developed
as an alternative to Shreve's random network model for river networks.

Prominently, the results of Burd et al. [5] reveal the Tokunaga self-similarity for any
process represented by the finite Galton-Watson critical binary branching. In the context
of this paper, the most important example is a regular Brownian motion, whose various
connections to the Galton-Watson processes are well-known (see Pitman [25] for a modern
review). For instance, the topological structure of the so-called h-excursions of a regular
Brownian motion [26] and a Poisson sampling of a Brownian excursion [27] are equivalent
to a finite critical binary Galton-Watson tree (Sect. 3 below explains the tree representation
of time series), and hence these processes are Tokunaga self-similar.

This study further explores Tokunaga self-similarity by focusing on trees that describe
the topological structure of the level sets of a time series or a real function, so-called
level-set trees. Our set-up is closely related to the classical Harris correspondence between trees
and finite random walks [28], and its later ramifications that include infinite trees with edge
lengths [5], [17]. The main result of this paper is the Tokunaga and
closely related Horton self-similarity for the level-set trees of finite symmetric homogeneous
Markov chains (SHMCs); see Sect. 5, Theorem 4. Notably, the Tokunaga and Horton
self-similarity concepts have been defined so far only for finite trees (e.g., [5], [15]). We
suggest here a natural extension of Tokunaga and Horton self-similarity to infinite trees and
establish self-similarity for an infinite-tree representation of a regular Brownian motion. The
suggested approach is based on the forest of trees attached to the floor line as described by
Pitman [25]. Finally, we discuss the strong distributional self-similarity that characterizes
Markov chains with exponential jumps.

The paper is organized as follows. Section 2 introduces planar rooted trees, trees with
edge lengths, Harris paths, and spaces of random trees with the Galton-Watson distribution.
The trees on continuous functions are described in Sect. 3. Several types of self-similarity for
trees (Horton, Tokunaga, and distributional self-similarity) are discussed in Sect. 4. The
main results of the paper are summarized in Sect. 5. Section 6 addresses special properties of
exponential Markov chains that, in particular, enjoy the strong distributional self-similarity.
Proofs are collected in Sect. 7. Section 8 concludes.

2. Trees

We introduce here planar trees, the corresponding Harris paths, and the space of
Galton-Watson trees following Burd et al. [5], Ossiander et al. [17] and Pitman [25].

2.1. Planar rooted trees 

Recall that a graph $G = (V, E)$ is a collection of vertices (nodes) $V = \{v_i\}$, $1 \le i \le N_V$,
and edges (links) $E = \{e_k\}$, $1 \le k \le N_E$. In a simple graph each edge is defined as
an unordered pair of distinct vertices: $\forall\, 1 \le k \le N_E$, $\exists!\, 1 \le i, j \le N_V$, $i \ne j$, such that
$e_k = (v_i, v_j)$, and we say that the edge $k$ connects vertices $v_i$ and $v_j$. Furthermore, each pair
of vertices in a simple graph may have at most one connecting edge.

A tree is a connected simple graph $T = (V, E)$ without cycles, which readily gives
$N_E = N_V - 1$. In a rooted tree, one node is designated as a root; this imposes a natural direction of
edges as well as the parent-child relationship between the vertices. Specifically, we follow [5]
to represent a labeled (planar) tree $T$ rooted at $\phi$ by a bijection between the set of vertices
$V$ and a set of finite integer-valued sequences $(i_1, \dots, i_n) \in T$ such that

(i) $\phi = \langle 0 \rangle$,

(ii) if $(i_1, \dots, i_n) \in T$ then $(i_1, \dots, i_k) \in T$ $\forall\, 1 \le k \le n$, and

(iii) if $(i_1, \dots, i_n) \in T$ then $(i_1, \dots, i_{n-1}, j) \in T$ $\forall\, 1 \le j \le i_n$.

This representation is illustrated in Fig. 1. If $v = (i_1, \dots, i_n) \in T$ then $w = (i_1, \dots, i_{n-1}) \in T$
is called the parent of $v$, and $v$ is a child of $w$. A leaf is a vertex with no children. The
number of children of a vertex $u = (i_1, \dots, i_n) \in T$ equals $c(u) = \max\{j\}$ over such $j$ that
$(u, j) = (i_1, \dots, i_n, j) \in T$. A binary labeled rooted tree is represented by a set of binary
sequences with elements $i_k = 1, 2$, where 1, 2 represent the left and right planar directions,
respectively. Two trees are called distinct if they are represented by distinct sets of
vertex-sequences. We complete each tree $T$ by a special ghost edge $e$ attached to the root $\phi$,
so each vertex in the tree has a single parental edge. A natural direction of edges is from a
vertex $v$ to its parent $v_p$.

In these settings, the total number of distinct trees with $n$ leaves, according to
Cayley's formula, is $n^{n-2}$. The total number of distinct binary trees with $n$ leaves is given
by the $(n-1)$-th Catalan number [25]

$$ C_{n-1} = \frac{1}{n} \binom{2n-2}{n-1}. $$
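
As a sanity check of the Catalan count, the following minimal sketch counts planar binary trees with $n$ leaves by the usual left/right split recursion and compares the result with the closed form $\frac{1}{n}\binom{2n-2}{n-1}$. The function names are illustrative and do not come from the paper.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_binary_trees(n_leaves: int) -> int:
    """Count planar rooted binary trees with n_leaves leaves.

    Every internal vertex has exactly two ordered children, so a tree with
    more than one leaf splits into a left subtree with k leaves and a right
    subtree with n_leaves - k leaves.
    """
    if n_leaves == 1:
        return 1
    return sum(
        count_binary_trees(k) * count_binary_trees(n_leaves - k)
        for k in range(1, n_leaves)
    )


def catalan_closed_form(n_leaves: int) -> int:
    """Closed form (1/n) * binom(2n - 2, n - 1), the (n-1)-th Catalan number."""
    return comb(2 * n_leaves - 2, n_leaves - 1) // n_leaves
```

For small $n$ the two functions can be compared directly, e.g. `count_binary_trees(4)` and `catalan_closed_form(4)`.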

2.2. Trees with edge-lengths and Harris path 

A tree with edge-lengths $T = (V, E, W)$ assigns a positive length $w(e)$ to each edge
$e$, $W = \{w(e)\}$; such trees are also called weighted trees. The sum of all
edge lengths is called the tree length; we write LENGTH$(T) = \sum_e w(e)$. We call the pair
$(V, E)$ a combinatorial tree and write $(V, E) = $ SHAPE$(T)$, emphasizing that the lengths are
disregarded in this representation.

If a tree is represented graphically in a plane, there is a unique continuous map

$$ \sigma_T : [0, 2\,\mathrm{LENGTH}(T)] \to T $$

that corresponds to the depth-first search of $T$, illustrated in Fig. 2(a). The depth-first search
starts at the root of a planar tree with edge-lengths and contours it, moving at unit speed,
from left to right so that each edge is traveled twice: its left side in a move away from
the root and its right side in a move towards the root. The Harris path for a tree $T$ is a
continuous function $H_T(s) : [0, 2\,\mathrm{LENGTH}(T)] \to \mathbb{R}$ that equals the distance from the root
traveled along the tree $T$ in the depth-first search. Accordingly, for a tree $T$ with $n$ leaves,
the Harris path $H_T(s)$ is a continuous excursion, that is, $H_T(0) = H_T(2\,\mathrm{LENGTH}(T)) = 0$ and
$H_T(s) > 0$ for any $s \in (0, 2\,\mathrm{LENGTH}(T))$, that consists of $2n$ linear segments of alternating
slopes $\pm 1$ [25], as illustrated in Fig. 2(b). The closely related Harris walk $H_n(k)$, $0 \le k \le 2n$,
for a tree with $n$ vertices is defined as a linearly interpolated discrete excursion with $2n$
steps that corresponds to the depth-first search that marks each vertex in a tree [28], [25].
Clearly, the Harris path and Harris walk, as functions $[0, 2\,\mathrm{LENGTH}(T)] \to \mathbb{R}$, have the same
trajectory. A binary tree with $n$ leaves has $2n - 1$ vertices; accordingly, its Harris path
consists of $2n$ segments, and its Harris walk consists of $4n - 2 = 2(2n - 1)$ steps.
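
The depth-first contour can be sketched in a few lines. The snippet below is only an illustration under assumed conventions: a tree is a hypothetical nested pair `(edge_length, [subtrees])`, where `edge_length` is the length of the edge to the parent (the ghost edge for the root) and subtrees are listed from left to right. It returns the breakpoints of the Harris path at the tree vertices.

```python
def harris_path(tree):
    """Return breakpoints [(s, H(s)), ...] of the Harris path of `tree`.

    `tree` is a pair (edge_length, children), where children is a list of
    subtrees ordered left to right. The contour walks up the left side of
    each edge, visits the children, and walks down the right side.
    """
    points = [(0.0, 0.0)]

    def visit(node, height):
        length, children = node
        s, _ = points[-1]
        top = height + length
        points.append((s + length, top))      # move away from the root
        for child in children:
            visit(child, top)
        s, _ = points[-1]
        points.append((s + length, height))   # move back towards the root

    visit(tree, 0.0)
    return points


# Ghost edge of length 1 carrying two leaves with edge lengths 2 and 1.
example = (1.0, [(2.0, []), (1.0, [])])
path = harris_path(example)
```

For this example the path ends at $s = 8$, twice the total edge length $1 + 2 + 1 = 4$, and it has two local maxima, one per leaf.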

2.3. Galton-Watson trees 

The space $\mathcal{T}$ of planar rooted trees with metric

$$ d(\tau, \psi) = \frac{1}{1 + \sup\{n : \tau|n = \psi|n\}}, $$

where $\tau|n = \{(i_1, \dots, i_k) \in \tau : k \le n\}$, forms a Polish metric space, with the countable dense
subset of finite trees [17], [5]. An important, and most studied, class of distributions on
$\mathcal{T}$ is the Galton-Watson distribution; it corresponds to the trees generated by the
Galton-Watson process with a single progenitor and the branching distribution $\{p_k\}$. Formally,
the distribution $GW_{\{p_k\}}$ assigns the following probability to a closed ball $B(\tau, 1/n)$, $\tau \in \mathcal{T}$,
$n = 1, 2, \dots$:



where $c(v)$ is the number of children of vertex $v$ [5], [17].

The classical work of Harris [28] notices that the Harris walk for a Galton-Watson tree
with unit edge-lengths, $n$ vertices and geometric offspring distribution is an unsigned
excursion of length $2n$ of a random walk with independent steps $\pm 1$. Hence, by the conditional
Donsker's theorem [25], a properly normalized Harris walk should weakly converge to a
Brownian excursion. Aldous [29], [30], [31], Le Gall [32], [33], and Ossiander et al. [17] have
shown that the same limiting behavior is seen for a broader class of Galton-Watson trees,
which may have non-trivial edge-lengths and non-geometric offspring distribution.

Theorem 1. [17, Theorem 3.1] Let $T_n$ be a Galton-Watson tree with the total progeny
$n$ and offspring distribution $L$ such that $\gcd\{j : P(L = j) > 0\} = 1$, $E(L) = 1$, and
$0 < \mathrm{Var}(L) = \sigma^2 < \infty$, where $\gcd\{\cdot\}$ denotes the greatest common divisor. Suppose that
the i.i.d. lengths $W = \{w(e)\}$ are positive, independent of $T_n$, have mean 1 and variance $s^2$,
and assume that $\lim_{x \to \infty} (x \log x)^2 P(|w(\phi) - 1| > x) = 0$. Then the scaled Harris walk $H_n(k)$
converges in distribution to a standard Brownian excursion $B^{ex}_t$:

$$ \{H_n(2nt)/\sqrt{n},\ 0 \le t \le 1\} \xrightarrow{d} \{2\sigma^{-1} B^{ex}_t,\ 0 \le t \le 1\}, \quad \text{as } n \to \infty. $$

This paper explores an "inverse" problem: it describes trees that correspond to a given
finite or infinite Harris walk. We show, in particular, that the class of trees that correspond
to the Harris walks that weakly converge to a Brownian excursion $B^{ex}_t$ is much broader than
the space of Galton-Watson trees.

3. Trees on continuous functions 

Let $X_t \equiv X(t) \in C([L, R])$ be a continuous function on a finite interval $[L, R]$, $L, R < \infty$.
This section defines the tree associated with $X_t$. We start with a simple situation when $X_t$
has a finite number of local extrema and continue with the general case.

3.1. Tamed functions: Level set trees 

Suppose that the function $X_t \in C([L, R])$ has a finite number of local extrema. The
level set $\mathcal{L}_\alpha(X_t)$ is defined as the pre-image of the function values above $\alpha$:

$$ \mathcal{L}_\alpha(X_t) = \{t : X_t \ge \alpha\}. $$

The level set $\mathcal{L}_\alpha$ for each $\alpha$ is a union of non-overlapping intervals; we write $|\mathcal{L}_\alpha|$ for their
number. Notice that (i) $|\mathcal{L}_\alpha| = |\mathcal{L}_\beta|$ as soon as the interval $[\alpha, \beta]$ does not contain a value
of local minima of $X_t$, (ii) $|\mathcal{L}_\alpha| \ge |\mathcal{L}_\beta|$ for any $\alpha > \beta$, and (iii) $0 \le |\mathcal{L}_\alpha| \le n$, where $n$ is the
number of the local maxima of $X_t$.
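
For a sampled, linearly interpolated function the interval count can be read off the samples: each component of the level set contains at least one sample at or above the threshold, and neighbouring samples at or above the threshold lie in the same component. The sketch below relies on that observation; the function name and the sample data are illustrative assumptions.

```python
def count_level_set_intervals(values, alpha):
    """Number of intervals in {t : X_t >= alpha} for piecewise-linear X.

    `values` are the samples of X at consecutive time points; the count is
    the number of maximal runs of consecutive samples with value >= alpha.
    """
    count = 0
    inside = False
    for v in values:
        above = v >= alpha
        if above and not inside:
            count += 1          # a new interval of the level set starts here
        inside = above
    return count


samples = [0, 3, 1, 4, 2, 5, 0]   # three local maxima: 3, 4, 5
counts = [count_level_set_intervals(samples, a) for a in (0, 1.5, 2.5, 6)]
```

In this example the count never exceeds the number of local maxima, in line with property (iii).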

The level set tree LEVEL$(X_t)$ describes the topology of the level sets $\mathcal{L}_\alpha$ as a function of
threshold $\alpha$, as illustrated in Fig. 3. Namely, there are bijections between (i) the leaves of
LEVEL$(X_t)$ and the local maxima of $X_t$, (ii) the internal (parental) vertices of LEVEL$(X_t)$
and the local minima of $X_t$ (excluding possible local minima at the boundary points), and
(iii) the edges of LEVEL$(X_t)$ and the first positive excursions of $X(t) - X(t_i)$ to the right and
left of each local minimum $t_i$. The leftmost and rightmost edges $(1, 1, \dots, 1)$ and $(2, 2, \dots, 2)$
may correspond to meanders, that is, to positive segments of $X(t) - X(t_i)$, rather than to
excursions. It is readily seen that any function $X_t$ with distinct values of the local minima
corresponds to a binary tree LEVEL$(X_t)$. In this case, the bijection (iii) can be separated into
the bijections between (iiia) the edges $(\dots, 1)$ of LEVEL$(X_t)$ and the first positive excursions
of $X(t) - X(t_i)$ to the left of each local minimum $t_i$, and (iiib) the edges $(\dots, 2)$ of LEVEL$(X_t)$
and the first positive excursions of $X(t) - X(t_i)$ to the right of each local minimum $t_i$. The
edge $e = (v, u)$ that connects the vertices $v$ and $u$ is assigned the length $w(e)$ equal to the
absolute difference between the values of the respective local extrema of $X_t$, according to
the bijections (i), (ii) above.

To complete the above construction, special care should be taken of the edge $e$ attached
to the tree root. Specifically, let $t_i$, $i = 1, \dots, n$, be the set of internal local minima of $X_t$,
defined as the set of points such that for any $i$ there exists an open interval $(a_i, b_i) \ni t_i$
such that $X(t_i) \le X(s)$ for any $s \in (t_i, b_i)$, $X(t_i) < X(b_i)$, and $X(t_i) < X(s)$ for any $s \in (a_i, t_i)$.

The last definition treats only the leftmost point of any constant-level trough as a local
minimum. The root of the tree LEVEL$(X_t)$ corresponds to the lowest internal minimum. If the
global minimum $M$ of $X_t$ is reached at one of the boundary points, say at $X(L)$, the root of
LEVEL$(X_t)$ has the parental edge $e$ with the length $w(e) = \min_i X(t_i) - X(L)$. At the same
time, if the global minimum $M$ of $X_t$ is reached at one of the internal local minima, that
is, if $M = \min_i X(t_i) < \min(X(L), X(R))$, then $|\mathcal{L}_\alpha| = 1$ for any $\alpha < M$ and $|\mathcal{L}_\alpha| > 1$ for
any $\alpha > M$. In other words, the root of LEVEL$(X_t)$ does not have the parental edge. In this
case, we add the ghost parental edge $e$ with edge length $w(e) = 1$. We write LEVEL$(X_t, w(e))$
to explicitly indicate the length of the ghost edge that might be added to the level-set tree
and save the notation $w(e)$ for the value defined above uniquely for each function $X_t$.

By construction, the level set trees are invariant with respect to monotone
transformations of time and values of $X_t$:

Proposition 1. Let $F(\cdot)$ and $G(\cdot)$ be monotone functions such that $Y_t = F(X_{G(t)})$ is a
continuous function on $G([L, R])$. Then the function $Y_t$ has the same combinatorial level set
tree as the original function $X_t$, that is

$$ \mathrm{SHAPE}(\mathrm{LEVEL}(X_t, 1)) = \mathrm{SHAPE}(\mathrm{LEVEL}(Y_t, 1)). $$

The tree with edge lengths LEVEL$(X_t, 1)$ is completely specified by the set of the local
extrema of $X_t$ and its boundary values, and is independent of the detailed structure of
the intervals of monotonicity. To formalize this observation, we write $\mathcal{E}_X(s)$ for the linear
extreme function obtained from $X_t$ by (i) linearly interpolating its consecutive local extrema
and the two boundary values, and (ii) changing time within each monotonicity interval so as
to have only constant slopes $\pm 1$. The function $\mathcal{E}_X(s)$ hence is a piece-wise linear function
with slopes $\pm 1$. The length of the domain of this function equals the total variation of $X_t$.
We shift this domain to start at $s = w(e) + X(L) - \min_i X(t_i)$, where $t_i$ are the points of
internal local minima as defined above.

Proposition 2. The level set tree of a function $X_t$ coincides with that of the linear extreme
function $\mathcal{E}_X$: LEVEL$(X_t, 1) = $ LEVEL$(\mathcal{E}_X, 1)$.

The particular domain specification of $\mathcal{E}_X(z)$ is explained by the following statement.

Proposition 3. Let $H_T(s)$, $s \in [0, 2\,\mathrm{LENGTH}(T)]$, be the Harris path of the level set tree
$T = $ LEVEL$(X_t, 1)$; then $H_T(z) = \mathcal{E}_X(z)$ on the domain $D$ of $\mathcal{E}_X$. The domains of $H_T(z)$
and $\mathcal{E}_X(z)$ coincide, i.e. $D = [0, 2\,\mathrm{LENGTH}(T)]$, if and only if $X_t$ is a positive excursion, and
$D \subset [0, 2\,\mathrm{LENGTH}(T)]$ otherwise.

It is known that each piece-wise linear positive excursion (Harris path) that consists of
$2n$ segments with slopes $\pm 1$ uniquely specifies a tree $T$ with no vertices of degree 2 (e.g.,
[23]). Recall that a Harris path corresponds to the depth-first search that visits each edge in
a tree twice; hence the Harris path $H_T$ over-specifies the corresponding tree $T$. Similarly, the
function $\mathcal{E}_X(s)$ uniquely specifies (and possibly over-specifies) the tree LEVEL$(X_t, 1)$ with
no vertices of degree 2. If $X_t$ has distinct values of the local minima, then $\mathcal{E}_X(s)$ uniquely
specifies the binary tree LEVEL$(X_t, 1)$.

Our definition of the level-set tree cannot be directly applied to a continuous function
with an infinite number of local extrema, say to a trajectory of a Brownian motion. This
motivates the general set-up reviewed in the next section [25], [32].

3.2. General case 

Let $X_t \equiv X(t) \in C([L, R])$ and $\underline{X}[a, b] := \inf_{t \in [a, b]} X(t)$, for any $a, b \in [L, R]$. We define
a pseudo-metric on $[L, R]$ as

$$ d_X(a, b) := \left(X(a) - \underline{X}[a, b]\right) + \left(X(b) - \underline{X}[a, b]\right), \quad a, b \in [L, R]. \tag{1} $$

It is easily verified that if $X_t$ is the Harris path for a finite tree $T$ and $\sigma_T$ is the corresponding
depth-first search, then $d_X(a, b)$ equals the distance along the tree $T$ between the points $\sigma_T(a)$
and $\sigma_T(b)$ (see Fig. 4). We write $a \sim_X b$ if $d_X(a, b) = 0$. Accordingly, we define the tree TREE$(X)$
for the function $X_t$ as the metric space $([L, R]/\!\sim_X, d_X)$ [23].
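
A minimal sketch of the pseudo-metric (1) for a piecewise-linear function given by its samples: when $a$ and $b$ are sample points, the infimum over $[a, b]$ is attained at a sample, so it reduces to a minimum over a slice. The function name and the example path are illustrative assumptions.

```python
def tree_distance(values, a, b):
    """Pseudo-metric d_X(a, b) of Eq. (1) for sample indices a and b.

    `values` holds samples of a piecewise-linear X; the infimum of X over
    [a, b] is then the minimum of the samples between the two indices.
    """
    lo, hi = min(a, b), max(a, b)
    floor = min(values[lo:hi + 1])
    return (values[a] - floor) + (values[b] - floor)


# Heights of a Harris path for a tree with a root edge of length 1 and two
# leaves attached by edges of lengths 2 and 1 (an illustrative example).
heights = [0, 1, 3, 1, 2, 1, 0]
leaf_to_leaf = tree_distance(heights, 2, 4)   # path between the two leaves
same_vertex = tree_distance(heights, 1, 3)    # two visits of one vertex
```

Indices 1 and 3 are two visits of the same internal vertex, so their distance vanishes and they are identified by the relation $\sim_X$.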

Remark. The definition of the level set tree can be readily applied to a real-valued Morse
function $f : M \to \mathbb{R}$ on a smooth manifold $M$. This is convenient for studying functions
in higher-dimensional domains; see, for instance, Arnold [36] and Edelsbrunner et al. [37].
The Harris-path and metric-space definitions are not readily applicable to multidimensional
domains.



4. Self-similar trees 

This section describes the three basic forms of tree self-similarity: (i) Horton laws,
(ii) self-similarity of side-branching, and (iii) Tokunaga self-similarity. They are based on
the Horton-Strahler and Tokunaga schemes for ordering vertices in a rooted binary tree. The
presented approach was introduced by Horton [6] for ordering hierarchically organized river
tributaries; the method was later refined by Strahler [7] and further expanded by Tokunaga
[9] to include so-called side-branching.

4.1. Horton-Strahler ordering

The Horton-Strahler (HS) ordering of the vertices of a finite rooted labeled binary tree
is performed in a hierarchical fashion, from leaves to the root [2], [5], [6], [7]: (i) each leaf has
order $r(\mathrm{leaf}) = 1$; (ii) when both children, $c_1$, $c_2$, of a parent vertex $p$ have the same order $r$,
the vertex $p$ is assigned order $r(p) = r + 1$; (iii) when two children of vertex $p$ have different
orders, the vertex $p$ is assigned the higher order of the two. Figure 5(a) illustrates this
definition. Formally,

$$ r(p) = \begin{cases} r(c_1) + 1 & \text{if } r(c_1) = r(c_2), \\ \max\left(r(c_1), r(c_2)\right) & \text{if } r(c_1) \ne r(c_2). \end{cases} \tag{2} $$
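
The recursive rule (2) translates directly into code. The sketch below assumes a hypothetical encoding in which a leaf is the empty tuple `()` and an internal vertex is a pair `(left, right)` of subtrees; it returns the order of the root.

```python
def horton_strahler(tree):
    """Horton-Strahler order of the root of a binary tree.

    A leaf is the empty tuple (); an internal vertex is (left, right).
    """
    if not tree:                       # rule (i): each leaf has order 1
        return 1
    left, right = tree
    r_left = horton_strahler(left)
    r_right = horton_strahler(right)
    if r_left == r_right:              # rule (ii): equal orders raise the order
        return r_left + 1
    return max(r_left, r_right)        # rule (iii): keep the higher order


leaf = ()
cherry = (leaf, leaf)                  # two leaves under one parent: order 2
comb = ((cherry, leaf), leaf)          # side branches do not raise the order
complete = (cherry, cherry)            # complete tree on 4 leaves: order 3
```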

A branch is defined as a union of connected vertices with the same order. The branch
vertex nearest to the root is called the initial vertex; the vertex farthest from the root is
called the terminal vertex. The order $\Omega(T)$ of a finite tree $T$ is the order $r(\phi)$ of its root, or,
equivalently, the maximal order of its branches (or nodes). The magnitude $m_i$ of a branch
$i$ is the number of the leaves descendant from its initial vertex. Let $N_r$ denote the total
number of branches of order $r$ and $M_r$ the average magnitude of branches of order $r$ in a
finite tree $T$.

An equivalent, and intuitively more appealing, definition of the Horton-Strahler orders is
given via the operation of pruning [13]. The pruning of an empty tree results in an empty
tree, $\mathcal{R}(\phi) = \phi$. The pruning $\mathcal{R}(T)$ of a non-empty tree $T$, not necessarily binary, cuts the
leaves and possible chains of degree-2 vertices connected to the leaves. A vertex of degree 2
(or a single-child vertex) $v$ is defined by the conditions $(v, 1) \in T$, $(v, 2) \notin T$. Each chain of
degree-2 vertices connected to a leaf is uniquely identified by a vertex $v$ such that $(v, u) \in T$
implies $u = (1, \dots, 1)$. The pruning operation is illustrated in Fig. 6.

The first application of pruning to a binary tree $T$ simply cuts the leaves, possibly
producing some single-child vertices. Some of those vertices are connected to the leaves via
other single-child vertices and thus will be cut at the next pruning, while the others occur
deeper within the pruned tree and will wait for their turn to be removed. It is readily seen
that repetitive application of pruning to any tree will result in the empty tree $\phi$. The minimal
$\Omega$ such that $\mathcal{R}^{(\Omega)}(T) = \phi$ is called the order of the tree. A vertex $v$ of tree $T$ has the order $r$ if
it has been removed at the $r$-th application of pruning: $v \in \mathcal{R}^{(k)}(T)$ $\forall\, 1 \le k < r$, $v \notin \mathcal{R}^{(r)}(T)$.
We say that a binary tree $T$ is complete if any of the following equivalent statements holds: (i)
each branch of $T$ consists of a single vertex; (ii) orders of siblings (vertices with a common
parent) are equal; (iii) the parent vertex's order is one higher than that of each of its
children. There exists only one complete binary tree on $n = 2^k$ leaves for each $k = 0, 1, \dots$;
all other trees are called incomplete.

4.2. Tokunaga indexing

The Tokunaga indexing extends the Horton-Strahler orders; it is illustrated
in Fig. 5(b). This indexing focuses on incomplete trees by cataloging side-branching, which is
the merging between branches of different order. Let $\tau^k_{ij}$, $1 \le k \le N_j$, $1 \le i < j \le \Omega$, denote
the number of branches of order $i$ that join the non-terminal vertices of the $k$-th branch of
order $j$. Then $N_{ij} = \sum_k \tau^k_{ij}$, $j > i$, is the total number of such branches in a tree $T$. The
Tokunaga index $T_{ij}$ is the average number of branches of order $i < j$ per branch of order $j$
in a finite tree of order $\Omega \ge j$:

$$ T_{ij} = \frac{N_{ij}}{N_j}. \tag{3} $$
T„ = |. (3) 

In a probabilistic set-up, one considers a space of finite binary trees with some probability
measure. Then, $N_i$, $\tau^k_{ij}$, $N_{ij}$, and $T_{ij}$ become random variables. We notice that if, for a given
$\{ij\}$, the side-branch counts $\tau^k_{ij}$ are independent identically distributed random variables,
$\tau^k_{ij} \stackrel{d}{=} \tau_{ij}$, then, by the law of large numbers,

$$ T_{ij} \xrightarrow{a.s.} E(\tau_{ij}) \quad \text{as } N_j \to \infty, $$

where the almost sure convergence $X_r \xrightarrow{a.s.} \mu$ is understood as $P\left(\lim_{r \to \infty} X_r = \mu\right) = 1$.

For consistency, we denote the total number of order-$i$ branches that merge with other
order-$i$ branches by $N_{ii}$ and notice that in a binary tree $N_{ii} = 2N_{i+1}$. This allows us to
formally introduce the additional Tokunaga indices: $T_{ii} = N_{ii}/N_{i+1} = 2$. The set $\{T_{ij}\}$,
$1 \le i \le \Omega - 1$, $1 \le j \le \Omega$, $i \le j$, of Tokunaga indices provides a complete statistical
description of the branching structure of a finite tree of order $\Omega$.
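
The branch counts $N_r$, the side-branch counts $N_{ij}$ and the indices $T_{ij} = N_{ij}/N_j$ can be accumulated in a single bottom-up pass. The sketch below is illustrative only and assumes the same hypothetical encoding as before: a leaf is `()`, an internal vertex is `(left, right)`. A new branch starts at every leaf and at every vertex whose children have equal orders; a merger of unequal orders $i < j$ is counted as a side branch $\{ij\}$.

```python
from collections import Counter


def branch_statistics(tree):
    """Return (order, N, Nij) for a binary tree of nested tuples.

    N[r] is the number of branches of order r; Nij[(i, j)] is the number of
    order-i side branches joining branches of order j > i.
    """
    N, Nij = Counter(), Counter()

    def visit(node):
        if not node:
            N[1] += 1                      # each leaf starts an order-1 branch
            return 1
        a, b = visit(node[0]), visit(node[1])
        if a == b:
            N[a + 1] += 1                  # two equal orders start a new branch
            return a + 1
        Nij[(min(a, b), max(a, b))] += 1   # side branch of the lower order
        return max(a, b)

    order = visit(tree)
    return order, N, Nij


def tokunaga_indices(tree):
    """Empirical Tokunaga indices T_ij = N_ij / N_j of Eq. (3)."""
    _, N, Nij = branch_statistics(tree)
    return {ij: count / N[ij[1]] for ij, count in Nij.items()}


leaf = ()
comb = (((leaf, leaf), leaf), leaf)   # one order-2 branch with two side leaves
```

The recursion depth equals the tree height, so this form is suitable for small illustrative trees rather than very deep ones.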

Next, we define several types of tree self-similarity based on the Horton-Strahler and 
Tokunaga indexing schemes. 

4.3. Horton laws

The Horton laws, widely observed in hydrological and biological networks,
state, in their ultimate form,

$$ \frac{N_r}{N_{r+1}} = R_B, \quad \frac{M_{r+1}}{M_r} = R_M, \quad R_B, R_M > 0, \ r \ge 1, $$

where $N_r$, $M_r$ are, respectively, the total number and average mass of branches of order $r$ in a
finite tree of order $\Omega$. McConnell and Gupta [34] emphasized the approximate, asymptotic
nature of the above empirical statements. In the present set-up, it will be natural to formulate
the Horton laws as the almost sure convergence of the ratios of the branch statistics as the
tree order increases:

$$ \frac{N_r}{N_{r+1}} \xrightarrow{a.s.} R_B > 0, \quad \text{for } r \ge 1, \text{ as } \Omega \to \infty, \tag{4} $$

$$ \frac{M_{r+1}}{M_r} \xrightarrow{a.s.} R_M > 0, \quad \text{as } r, \Omega \to \infty. \tag{5} $$



Notice that the convergence in (4) is seen for the small-order branches, while the convergence
in (5) is seen for large-order branches. We call (4), (5) the weak Horton laws. We also consider
strong Horton laws that assume an almost sure exponential dependence of the branch
characteristics on $r$ in a tree of finite order $\Omega$ and magnitude $N$:

$$ N_r \overset{a.s.}{\sim} N_0\, N\, R_B^{-r}, \quad \text{for } r \ge 1, \text{ as } \Omega \to \infty, \tag{6} $$

$$ M_r \overset{a.s.}{\sim} M_0\, R_M^{r}, \quad \text{as } r, \Omega \to \infty, \tag{7} $$

for some positive constants $N_0$, $M_0$, $R_B$ and $R_M$, and with $x_r \overset{a.s.}{\sim} y_r$ standing for

$$ P\left(\lim_{r \to \infty} x_r / y_r = 1\right) = 1. $$

Clearly, the strong Horton laws imply the weak Horton laws. The converse in general is not
true; this can be illustrated by a sequence $M_r = R_M^r\, r^C$, for any $C > 0$, for which the weak
Horton law (5) holds, while the strong law (7) fails. We notice also that $\Omega \to \infty$ implies
$N \to \infty$, but not vice versa; an example is given by a comb, that is, a tree of order $\Omega = 2$ with an
arbitrary number of side branches with Tokunaga index $\{12\}$. This is why the limits above
are taken with respect to $\Omega$, not $N$.
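
To spell out the counterexample in one line: for $M_r = R_M^r\, r^C$ with $C > 0$ the ratio of consecutive terms converges to $R_M$, whereas the ratio to any pure exponential $M_0 R_M^r$ diverges, so no constant $M_0$ can make it tend to 1. The computation below is a routine check added for convenience.

```latex
\frac{M_{r+1}}{M_r} = R_M \left(1 + \frac{1}{r}\right)^{C} \longrightarrow R_M,
\qquad
\frac{M_r}{M_0 R_M^{\,r}} = \frac{r^{C}}{M_0} \longrightarrow \infty
\qquad (r \to \infty).
```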

The strong Horton laws imply, in particular, that

$$ N_r \overset{a.s.}{\sim} \mathrm{const} \cdot M_r^{-\alpha}, \quad \alpha = \frac{\log R_B}{\log R_M}, \tag{8} $$

for appropriately chosen $r \to \infty$ and $\Omega \to \infty$, for instance $r = \sqrt{\Omega}$. The relationship (8) is
the simplest indication of self-similarity, as it connects the number $N_r$ and the size $M_r$ of
branches via a power law. However, a more restrictive property is conventionally required
to call a tree self-similar; it is discussed in the next section.

4.4. Tokunaga self-similarity

In a deterministic setting, we call a tree $T$ of order $\Omega$ a self-similar tree (SST) if its
side-branching structure (i) is the same for all branches of a given order:

$$ \tau^k_{ij} =: T_{ij}, \quad 1 \le k \le N_j, \ 1 \le i < j \le \Omega, $$

and (ii) is invariant with respect to the branch order:

$$ T_{i(i+k)} =: T_k \quad \text{for } 2 \le i + k \le \Omega. \tag{9} $$

A Tokunaga self-similar tree (TSST) obeys an additional constraint first considered by
Tokunaga [9]:

$$ T_{k+1}/T_k = c \iff T_k = a\, c^{k-1}, \quad a, c > 0, \ 1 \le k \le \Omega - 1. \tag{10} $$

In a random setting, we say that a tree $T$ of order $\Omega$ is self-similar if $E\left(\tau^j_{i(i+k)}\right) =: T_k$ for
$1 \le j \le N_{i+k}$, $2 \le i + k \le \Omega$; and it is Tokunaga self-similar if, furthermore, the condition
(10) holds.

In a deterministic setting, for a tree satisfying the weak Horton and Tokunaga laws¹ one
has [9], [15]:

$$ R_B = \frac{2 + c + a + \sqrt{(2 + c + a)^2 - 8c}}{2}. \tag{11} $$
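
The relation between the Tokunaga parameters $(a, c)$ and the Horton exponent $R_B$ is easy to evaluate numerically. The sketch below assumes the closed form $R_B = \left(2 + c + a + \sqrt{(2 + c + a)^2 - 8c}\right)/2$; the function name is illustrative, and the pair $a = 1$, $c = 2$ is used purely as an example input.

```python
from math import sqrt


def horton_exponent(a: float, c: float) -> float:
    """Horton exponent R_B for Tokunaga parameters a, c > 0.

    Uses R_B = (2 + c + a + sqrt((2 + c + a)^2 - 8c)) / 2. The discriminant
    is positive for a, c > 0 because (2 + c)^2 - 8c = (c - 2)^2 >= 0.
    """
    s = 2.0 + c + a
    return (s + sqrt(s * s - 8.0 * c)) / 2.0


r_b = horton_exponent(1.0, 2.0)   # example input a = 1, c = 2
```

With these example values the square root equals 3 and the function returns 4.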

Peckham [15] has noticed that in a Tokunaga tree of order $\Omega$ one has $N_r = M_{\Omega - r + 1}$, which
implies that the Horton laws for masses $M_r$ follow from the Horton laws for the counts $N_r$
and $R_M = R_B$. McConnell and Gupta [34] have shown that the weak Horton laws with
$R_B = R_M$ hold in a self-similar Tokunaga tree. Zaliapin [35] has shown, moreover, that
strong Horton laws hold in a Tokunaga tree and, at the same time, even weak Horton laws
may not hold in a general, non-Tokunaga, self-similar tree.

The Tokunaga self-similarity describes a two-parametric class of trees, specified by the
Tokunaga parameters $(a, c)$. Our goal is to demonstrate that the Tokunaga class is not only
structurally simple but is also sufficiently wide. This study establishes the Tokunaga
self-similarity for the level-set trees of symmetric homogeneous Markov chains, and, as a direct
consequence, for the trees of their scaling limits including a regular Brownian motion.



¹ In a deterministic setting, the convergence in the Horton laws is understood as the convergence of
sequences.

4.5. Stochastic self-similarity

Burd et al. [5] define stochastic self-similarity for a random tree $\tau \in (\mathcal{T}, P)$ as the
distributional invariance with respect to the pruning $\mathcal{R}(\tau)$:

$$ P(\cdot \mid \tau \ne \phi) \circ \mathcal{R}^{-1} = P(\cdot) $$

and prove the following result that explains the importance of Tokunaga self-similarity within
the class of Galton-Watson trees as well as the special role of the Galton-Watson critical
binary trees.

Theorem 2. [5, Theorems 1.1, 1.2, 3.17] Let $\tau \in (\mathcal{T}, GW_{\{p_k\}})$ with bounded offspring
number. Then the following statements are equivalent:

(i) Tree $\tau$ is stochastically self-similar.

(ii) $E(T_{i(i+k)}) =: T_k$, i.e., the expectation is a function of $k$ and $T_k$ is defined by this equation.

(iii) Tree $\tau$ has the critical binary offspring distribution, $p_0 = p_2 = 1/2$.

These authors show, furthermore, how the arbitrary binary Galton-Watson distribution
is transformed under the operation of pruning.

Theorem 3. [5, Proposition 2.1] Let $\tau$ be a finite tree with a binary Galton-Watson
distribution, $p_0 + p_2 = 1$, with $p_2 \le 1/2$. Let $\tau_{n+1} = \mathcal{R}(\tau_n)$, $n \ge 0$, $\tau_0 = \tau$. Then $\tau_{n+1}$ has the
binary Galton-Watson distribution $p_0^{(n+1)} + p_2^{(n+1)} = 1$ with

$$ p_2^{(n+1)} = \frac{\left[p_2^{(n)}\right]^2}{\left[p_0^{(n)}\right]^2 + \left[p_2^{(n)}\right]^2}. $$
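
The effect of repeated pruning on the offspring distribution can be explored numerically. The sketch below assumes the update $p_2 \mapsto p_2^2/(p_0^2 + p_2^2)$ with $p_0 = 1 - p_2$; the function names are illustrative. Under this assumed map the critical value $p_2 = 1/2$ is a fixed point, while a subcritical starting value is driven towards zero.

```python
def pruned_p2(p2: float) -> float:
    """One pruning step for a binary Galton-Watson law with p0 + p2 = 1.

    Assumed update: p2' = p2^2 / (p0^2 + p2^2).
    """
    p0 = 1.0 - p2
    return p2 * p2 / (p0 * p0 + p2 * p2)


def iterate_pruning(p2: float, steps: int) -> list:
    """Return [p2, p2', p2'', ...] after `steps` applications of pruning."""
    trajectory = [p2]
    for _ in range(steps):
        trajectory.append(pruned_p2(trajectory[-1]))
    return trajectory


critical = iterate_pruning(0.5, 5)      # stays at the critical value
subcritical = iterate_pruning(0.4, 5)   # decreases at every step
```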



We demonstrate below that stochastic (or distributional) self-similarity, within the class 
of tree representations of homogeneous Markov chains, holds only for Markov chains with 
symmetric exponential increments. 



5. Main results 

Let $X_k$, $k \in \mathbb{Z}$, be a real-valued Markov chain with homogeneous transition kernel
$K(x, y) \equiv K(x - y)$, for any $x, y \in \mathbb{R}$. We call $X_k$ a homogeneous Markov chain (HMC).
When working with trees, $X_k$ will also denote a function from $C(\mathbb{R})$ obtained by linear
interpolation of the values of the original time series $X_k$; this creates no ambiguities in the present
context.

An HMC is called symmetric (SHMC) if its transition kernel satisfies $K(x) = K(-x)$ for
any $x \in \mathbb{R}$. We call an HMC exponential (EHMC) if its kernel is a mixture of exponential
jumps. Namely,

$$ K(x) = p\, \phi_{\lambda_u}(x) + (1 - p)\, \phi_{\lambda_d}(-x), \quad 0 \le p \le 1, \ \lambda_u, \lambda_d > 0, $$

where $\phi_\lambda$ is the exponential density

$$ \phi_\lambda(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0, \\ 0, & x < 0. \end{cases} \tag{12} $$

We will refer to an EHMC by its parameter triplet $\{p, \lambda_u, \lambda_d\}$.
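
An exponential chain with a given parameter triplet is straightforward to simulate: at every step the chain jumps up by an exponential amount with rate $\lambda_u$ with probability $p$, and down by an exponential amount with rate $\lambda_d$ otherwise. The sketch below uses only the standard library; the function name, the starting value and the seeding are illustrative assumptions.

```python
import random


def simulate_ehmc(n_steps, p, lam_u, lam_d, x0=0.0, seed=None):
    """Simulate n_steps of an exponential homogeneous Markov chain.

    With probability p the increment is +Exp(lam_u); otherwise it is
    -Exp(lam_d). Returns the list [X_0, X_1, ..., X_n].
    """
    rng = random.Random(seed)
    path = [x0]
    for _ in range(n_steps):
        if rng.random() < p:
            path.append(path[-1] + rng.expovariate(lam_u))   # jump up
        else:
            path.append(path[-1] - rng.expovariate(lam_d))   # jump down
    return path


# A symmetric exponential chain: p = 1/2 and equal rates.
path = simulate_ehmc(1000, 0.5, 1.0, 1.0, seed=42)
```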

The concept of tree self-similarity is based on the notion of branch order and is tightly
connected to the pruning operation (Sect. 4.1, Fig. 6). In terms of time series (or tamed
real functions), pruning corresponds to coarsening the time series resolution by removing the
local maxima. An iterative pruning corresponds to iterative transition to the local minima.
We formulate this observation in the following proposition.

Proposition 4. The transition from a time series X& to the time series Xjp of its local 
minima corresponds to the pruning of the level-set tree level(X). Formally, 

LEVEL (X (m) ) = TZ m (LEVEL (X)) , Vm > 1, 

where X^ is obtained from X by iteratively taking local minima m times (i.e., local minima 
of local minima and so on.) 
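The time-series side of Proposition 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `local_minima` and `prune_series` are hypothetical, boundary points are never counted as local minima, and neighboring values are assumed distinct (which holds almost surely for a chain with continuous increments).

```python
from typing import List, Sequence


def local_minima(x: Sequence[float]) -> List[float]:
    """Return the interior local minima of x in their original order.

    Boundary points are skipped; ties between neighbors are not handled.
    """
    return [x[i] for i in range(1, len(x) - 1) if x[i - 1] > x[i] < x[i + 1]]


def prune_series(x: Sequence[float], m: int = 1) -> List[float]:
    """Apply the local-minima transition m times (the series X^(m))."""
    out = list(x)
    for _ in range(m):
        out = local_minima(out)
    return out


series = [0, 3, 1, 4, 2, 5, 0.5, 6, 1.5, 7, 0]
first = prune_series(series, 1)   # local minima of the series
second = prune_series(series, 2)  # local minima of local minima
print(first, second)
```

Each call to `prune_series(series, m)` plays the role of $X^{(m)}$; building the level-set tree itself is deliberately left out of this sketch.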

The next result establishes invariance of several classes of Markov chains with respect to 
the pruning operation. 

Lemma 1. (a) The local minima of a HMC form a HMC. (b) The local minima of a SHMC form a SHMC. (c) The local minima of an EHMC with parameters $\{p, \lambda_u, \lambda_d\}$ form an EHMC with parameters $\{p^*, \lambda_u^*, \lambda_d^*\}$, where

$$p^* = \frac{p\,\lambda_d}{p\,\lambda_d + (1 - p)\,\lambda_u}, \quad \lambda_d^* = p\,\lambda_d, \quad \text{and} \quad \lambda_u^* = (1 - p)\,\lambda_u. \qquad (13)$$

Let $\{M_t\} \equiv \{M_t^{(1)}\}$, $t \in \mathcal{T}_1 \subset \mathbb{R}$, be the set of local minima of $X_t$, not including the boundary minima; $\{M_t^{(2)}\}$, $t \in \mathcal{T}_2 \subset \mathbb{R}$, be the set of local minima of local minima (local minima of second order), etc., with $\{M_t^{(j)}\}$, $t \in \mathcal{T}_j \subset \mathbb{R}$, being the local minima of order $j$. We call a segment between two consecutive points from $\mathcal{T}_r$, $r \ge 1$, a (complete) basin of order $r$. For each $r$, there might exist a single leftmost and a single rightmost segment of $X_t$ that do not belong to any basin of order $r$, with a possibility for them to merge if $X_t$ does not have basins of order $r$ at all. We call those segments incomplete basins of order $r$. There is a bijection between basins (complete and incomplete) of order $r$ in $X_t$ and branches of Horton-Strahler order $r$ in $\textsc{level}(X_t)$. This explains the terms complete branch and incomplete branch of order $r$.

Theorem 4 (Horton and Tokunaga self-similarity). The combinatorial level set tree $\textsc{shape}(\textsc{level}(X, 1))$ of a finite SHMC $X_k$, $k = 1, \dots, N$, satisfies the strong Horton laws for any $r \ge 1$, asymptotically in $N$:

$$N_r \overset{\text{a.s.}}{\sim} N R_B^{-r}, \quad R_B = 4, \quad \text{as } N \to \infty. \qquad (14)$$

Furthermore, $T = \textsc{shape}(\textsc{level}(X, 1))$ is a Tokunaga self-similar tree with parameters $(a, c) = (1, 2)$. Specifically, for a finite tree $T$ of order $\Omega(N)$ the side-branch counts $\tau^j_{i(i+k)}$, with $2 \le i + k \le \Omega$, for different complete branches $j$ of order $(i + k)$ are independent identically distributed random variables such that $\tau^j_{i(i+k)} \overset{d}{=} \tau_{i(i+k)}$ and

$$E\left[\tau_{i(i+k)}\right] =: T_k = 2^{k-1}. \qquad (15)$$

Moreover, $\Omega \to \infty$ a.s. as $N \to \infty$ and, for any $i, k \ge 1$, we have

$$T_{i(i+k)} \xrightarrow{\text{a.s.}} T_k = 2^{k-1}, \quad \text{as } N \to \infty,$$

where $T_{i(i+k)}$ can be computed over the entire $X_k$.
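The Horton part of this statement can be checked informally by simulation. The sketch below is an illustration, not the authors' procedure: it assumes a Gaussian random walk as one convenient SHMC, counts only interior local maxima (boundary effects are negligible for large $N$), and uses hypothetical helper names. It takes $N_r$ to be the number of local maxima of the series pruned $r - 1$ times.

```python
import random
from typing import List, Sequence


def local_minima(x: Sequence[float]) -> List[float]:
    """Interior local minima of x, in order."""
    return [x[i] for i in range(1, len(x) - 1) if x[i - 1] > x[i] < x[i + 1]]


def count_local_maxima(x: Sequence[float]) -> int:
    """Number of interior local maxima of x."""
    return sum(1 for i in range(1, len(x) - 1) if x[i - 1] < x[i] > x[i + 1])


def horton_counts(x: Sequence[float], max_order: int) -> List[int]:
    """counts[r-1] approximates N_r: local maxima of the (r-1)-times pruned series."""
    counts = []
    series = list(x)
    for _ in range(max_order):
        counts.append(count_local_maxima(series))
        series = local_minima(series)
    return counts


rng = random.Random(0)            # fixed seed so the run is repeatable
n_points = 200_000
walk = [0.0]
for _ in range(n_points - 1):     # symmetric (Gaussian) increments
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))

counts = horton_counts(walk, 3)
ratios = [counts[r] / counts[r + 1] for r in range(len(counts) - 1)]
print(counts, ratios)             # ratios should be close to R_B = 4
```

Replacing the Gaussian increments by any other symmetric continuous law should leave the ratios near 4, which is the point of the theorem; an asymmetric law should not.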

Next we extend this result to the case of infinite time series and the weak limits of finite time series. For a linearly interpolated time series $X_t$, $t \ge 0$ (equivalently, for a continuous function with a countable number of separated local extrema), consider the descending ladder $L_X = \{t : X_t = \underline{X}[0, t]\}$, where $\underline{X}[0, t]$ is the minimum of $X$ over $[0, t]$; in our setting $L_X$ is a set of isolated points and non-overlapping intervals (Fig. 7). The function $X_t$ is naturally divided into a series of vertically shifted positive excursions on the intervals not included in $L_X$ and monotone falls on the intervals from $L_X$. Any (in the a.s. sense) infinite SHMC can be decomposed into an infinite number of such finite excursions and finite falls. We will index the excursions by index $i \ge 1$ from left to right. The extreme time series $S(X^i_k)$ for each finite excursion $X^i_t$ is a Harris path for a finite tree $\textsc{level}(X^i_t)$. Hence, each such finite excursion completely specifies a single subtree of $\textsc{tree}(X_t)$. In particular, it completely specifies the HS orders for all vertices and Tokunaga indices for all branches except the one containing the root within $\textsc{level}(X^i_t)$. We also notice that each fall of $X_t$ on an interval from $L_X$ corresponds to an individual edge of $\textsc{tree}(X_t)$. Combining the above observations, we conclude that the tree $\textsc{tree}(X_t)$ can be represented as an infinite number of subtrees $\textsc{level}(X^i_t)$ connected by edges that correspond to the falls of $X_t$ on the descending ladder, see Fig. 7. Pitman calls this construction, applied to the standard Brownian motion rather than time series, a forest of trees attached to the floor line [25, Section 7.4]. Let $N_r^n$ and $N_{ij}^n$ denote, respectively, the number of branches of order $r$ and the number of side branches of Tokunaga index $\{ij\}$ in the first $n$ excursions of $X_t$ as described above. We introduce the cumulative quantities

$$\eta_r^n = \frac{N_r^n}{N_{r+1}^n}, \quad T_{ij}^n = \frac{N_{ij}^n}{N_j^n}$$

and define, for the infinite time series $X_t$,

$$\eta_r(X_t) = \lim_{n \to \infty} \eta_r^n, \quad T_{ij}(X_t) = \lim_{n \to \infty} T_{ij}^n \qquad (16)$$

whenever the above limits exist in an appropriate probabilistic sense.

By Proposition 1 the level set tree of a finite excursion $X^i_t$ is not affected by monotonic transformations of time and value. This allows one to expand the above definition (16) to the weak limits of time series via Donsker's theorem. In particular, if $X_t$ is a SHMC whose increments have standard deviation $\sigma$, then the rescaled segments of $X_t$ weakly converge to the regular Brownian motion $B_t$, $0 \le t \le 1$. Namely,

$$X(nt)/\sqrt{n} \Rightarrow \sigma B_t$$

as $n \to \infty$ through the end points of the finite excursions that comprise $X_t$. This leads to the following result.

Corollary 1. The combinatorial tree $\textsc{shape}(\textsc{tree}(B_t))$ of a regular Brownian motion $B_t$, $t \in [0, 1]$, satisfies the Horton and Tokunaga self-similarity laws. Namely,

$$\eta_r(B_t) = 4 \ \text{ for } r \ge 1 \quad \text{and} \quad T_{i(i+k)}(B_t) = 2^{k-1} \ \text{ for } i, k \ge 1, \qquad (17)$$

where the limits (16) are understood in the almost sure sense.



We conclude this section with a conjecture motivated by the above result as well as extensive numeric simulations [23].

Conjecture 1. The tree $\textsc{shape}(\textsc{tree}(B^H_t))$ of a fractional Brownian motion $B^H_t$, $t \in [0, 1]$, with the Hurst index $0 < H < 1$ is Tokunaga self-similar with $T_{i(i+k)}(B^H) = T_k = c^{k-1}$, $c = 2H + 1$, $i, k \ge 1$. According to (11), this corresponds to the Horton self-similarity with

$$\eta_r(B^H) = 2 + H + \sqrt{H^2 + 2}, \quad r \ge 1.$$

The sense of limits (16) is to be determined.
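The Horton exponent in the conjecture can be cross-checked numerically against the Tokunaga parameters. The sketch below assumes that relation (11), which is not reproduced in this section, has the standard form $R_B = \left(2 + a + c + \sqrt{(2 + a + c)^2 - 8c}\right)/2$; the function names are hypothetical.

```python
import math


def horton_exponent(a: float, c: float) -> float:
    """Horton exponent R_B implied by Tokunaga parameters (a, c).

    Assumes the standard relation R_B = (s + sqrt(s^2 - 8c)) / 2, s = 2 + a + c.
    """
    s = 2.0 + a + c
    return 0.5 * (s + math.sqrt(s * s - 8.0 * c))


def conjectured_eta(hurst: float) -> float:
    """Closed form stated in Conjecture 1: 2 + H + sqrt(H^2 + 2)."""
    return 2.0 + hurst + math.sqrt(hurst * hurst + 2.0)


hurst_values = (0.1, 0.25, 0.5, 0.75, 0.9)
for h in hurst_values:
    c = 2.0 * h + 1.0             # conjectured Tokunaga parameter, with a = 1
    print(h, horton_exponent(1.0, c), conjectured_eta(h))
```

For $H = 1/2$ both expressions reduce to the Brownian value $R_B = 4$ of Corollary 1.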



6. Exponential chains 

This section focuses on exponential chains, which enjoy an important distributional self-similarity and whose level-set trees have the Galton-Watson distribution.

6.1. Distributional self-similarity 

Consider a SHMC $X_k$, $k \in \mathbb{Z}$, with kernel

$$K(x) = \frac{f(x) + f(-x)}{2}, \qquad (18)$$

where $f(x)$ is a probability density function with support $\mathbb{R}_+$. The series of local minima of $X_k$ (or, equivalently, the pruning $X_k^{(1)}$ of $X_k$) also forms a SHMC with transition kernel $K_1(x)$ (see Lemma 1(b)). It is natural to look for chains invariant with respect to the pruning:

$$X_k \overset{d}{=} c\,X_k^{(1)}, \quad c > 0. \qquad (19)$$

By Proposition 1 such invariance would guarantee the distributional Tokunaga self-similarity:

$$\tau^j_{i(i+k)} \overset{d}{=} \tau_{i(i+k)} =: \tau_k, \quad 1 \le j \le N_{i+k}, \quad 1 < i + k \le \Omega, \qquad (20)$$

where $\tau_k$ is a random number of side-branches of order $i$ that join an arbitrarily chosen branch of order $(i + k)$. Hence, we seek the conditions on $f(x)$ to ensure that $K_1(x) = c^{-1} K(x/c)$ for some constant $c > 0$.

Proposition 5. The local minima of a SHMC $X_k$ with kernel $K(x)$ form a SHMC with kernel

$$K_1(x) = c^{-1} K(x/c), \quad c > 0,$$

if and only if $c = 2$ and

$$\Re\left[\hat f(2s)\right] = \left|\frac{\hat f(s)}{2 - \hat f(s)}\right|^2, \qquad (21)$$

where $\hat f(s)$ is the characteristic function of $f(x)$ and $\Re[z]$ stands for the real part of $z \in \mathbb{C}$.



Observe that the set of densities $f(x)$ that satisfy (21) is not empty. A solution is given, for example, by the exponential density $f(x) = \phi_\lambda(x)$ of (12) with $\lambda > 0$; the kernel $K(x)$ is then the Laplace density, and the chain is the EHMC $\{1/2, \lambda, \lambda\}$.
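A quick numerical check makes this observation concrete. The sketch assumes that condition (21) reads $\Re[\hat f(2s)] = |\hat f(s)/(2 - \hat f(s))|^2$ and that the exponential density with rate $\lambda$ has characteristic function $\lambda/(\lambda - is)$. The uniform density on $[0, 1]$ is included only as a hypothetical counterexample, and all function names are illustrative.

```python
import cmath
from typing import Callable


def exponential_cf(lam: float) -> Callable[[float], complex]:
    """Characteristic function of the exponential density with rate lam."""
    return lambda s: lam / (lam - 1j * s)


def uniform_cf(s: float) -> complex:
    """Characteristic function of the uniform density on [0, 1] (s != 0)."""
    return (cmath.exp(1j * s) - 1.0) / (1j * s)


def condition_gap(cf: Callable[[float], complex], s: float) -> float:
    """Left side minus right side of condition (21) at the point s."""
    fh = cf(s)
    return cf(2.0 * s).real - abs(fh / (2.0 - fh)) ** 2


grid = [0.1 * k for k in range(1, 101)]          # s in (0, 10]
exp_gap = max(abs(condition_gap(exponential_cf(1.7), s)) for s in grid)
uni_gap = abs(condition_gap(uniform_cf, 1.0))
print(exp_gap, uni_gap)   # first is at rounding level, second is clearly nonzero
```

The exponential density satisfies the identity up to floating-point error for every rate, while the uniform density violates it, so not every symmetric kernel is distributionally self-similar under pruning.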



6.2. Distributional self-similarity for symmetric exponential chains 

Lemma 1(c) allows one to study the behavior of the EHMCs formed by local minima, minima of minima, and so on of an EHMC $X_k$ with parameters $\{p, \lambda_u, \lambda_d\}$. Introducing the variables

$$A = \frac{1 - p}{p}, \quad \gamma = \frac{\lambda_d}{\lambda_u}, \qquad (22)$$

one readily obtains that their counterparts $\{A^*, \gamma^*\}$ for the chain of local minima, given by (13), are expressed as

$$A^* = \frac{A}{\gamma}, \quad \gamma^* = \frac{\gamma}{A}. \qquad (23)$$

Notably, this means that the chain of local minima for any EHMC forms an EHMC with $A\gamma = 1$. The only fixed point in the space $(A, \gamma)$ with iteration rules (23) is the point $(A = 1, \gamma = 1)$, which corresponds to the distributionally self-similar EHMC discussed in Sect. 6.1. This point is an image (under the pruning operation) of the EHMCs with $A = \gamma$ or $p\,\lambda_d = (1 - p)\,\lambda_u$. The last condition is equivalent to $E(X_k - X_{k-1}) = 0$ for any $k > 1$. The chain of local minima for any EHMC with $A > \gamma$ ($A < \gamma$) corresponds to a point on the upper (lower) part of the hyperbola $A\gamma = 1$. Any point on this hyperbola, except the fixed point $(1, 1)$, moves away from the fixed point toward $(0, \infty)$ or $(\infty, 0)$. This is illustrated in Fig. 11. It follows that the Tokunaga and even weaker Horton self-similarity is only seen for a symmetric EHMC. The above discussion can be summarized in the following statement.

Theorem 5. Let $X_k$ be an EHMC $\{p, \lambda_u, \lambda_d\}$. Then $X_k$ satisfies the distributional self-similarity (19) if and only if $p = 1/2$, $\lambda_u = \lambda_d$. Furthermore, the multiple pruning $X_k^{(m)}$, $m \ge 1$, of $X_k$ satisfies the distributional self-similarity (19) if and only if the chain's increments have zero mean, or, equivalently, if and only if $p\,\lambda_d = (1 - p)\,\lambda_u$. In this case, the self-similarity is achieved after the first pruning, that is for the chain $X_k^{(1)}$ of local minima.
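The dichotomy in Theorem 5 can be traced numerically by iterating the parameter map of Lemma 1(c). The sketch below assumes the map reads $p^* = p\lambda_d / (p\lambda_d + (1-p)\lambda_u)$, $\lambda_u^* = (1-p)\lambda_u$, $\lambda_d^* = p\lambda_d$, and uses the variables $A = (1-p)/p$, $\gamma = \lambda_d/\lambda_u$; the function names and the two example chains are hypothetical.

```python
from typing import Tuple

Params = Tuple[float, float, float]   # (p, lambda_u, lambda_d)


def prune_ehmc(params: Params) -> Params:
    """Parameters of the chain of local minima of an EHMC (assumed form of (13))."""
    p, lam_u, lam_d = params
    up, down = (1.0 - p) * lam_u, p * lam_d
    return down / (down + up), up, down


def a_gamma(params: Params) -> Tuple[float, float]:
    """The variables A = (1 - p) / p and gamma = lambda_d / lambda_u."""
    p, lam_u, lam_d = params
    return (1.0 - p) / p, lam_d / lam_u


zero_mean = (0.25, 1.0, 3.0)   # p * lambda_d == (1 - p) * lambda_u
drifting = (0.6, 1.0, 1.0)     # nonzero mean increment

once = prune_ehmc(zero_mean)   # becomes symmetric after one pruning
print(once, a_gamma(once))

chain = drifting
for m in range(1, 7):
    chain = prune_ehmc(chain)
    a, g = a_gamma(chain)
    print(m, chain[0], a * g)  # A * gamma stays at 1 while p drifts toward 1
```

The zero-mean chain lands on the fixed point $(A, \gamma) = (1, 1)$ after a single pruning, whereas the drifting chain stays on the hyperbola $A\gamma = 1$ and moves away from it.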



Corollary 2. The regular Brownian motion with drift is not Tokunaga self-similar.

6.3. Connection to Galton-Watson trees

An important, and well known, fact is that the Galton-Watson distribution (see Sect. 2.3) is the characteristic property of trees that have Harris paths with alternating exponential steps. We formulate this result using the terminology of our paper.

Theorem 6 ([21, Lemma 7.3]; Le Gall [32]; Neveu and Pitman). Let $X_k$ be a discrete-time excursion with a finite number of local minima. The level set tree $\textsc{shape}(\textsc{level}(X_k, 1))$ is a binary Galton-Watson tree with $p_0 + p_2 = 1$ if and only if the rises and falls of $X_k$, excluding the last fall, are distributed as independent exponential variables with parameters $(\mu + \lambda)$ and $(\mu - \lambda)$, respectively, for $0 \le \lambda < \mu$. In this case,

$$p_0 = \frac{\mu + \lambda}{2\mu}, \quad p_2 = \frac{\mu - \lambda}{2\mu}.$$

We now use this result to relate sequential pruning of Galton-Watson trees (see Theorem 3) and pruning of EHMCs. Consider the first positive excursion $X_k$ of an EHMC with parameters $\{p^{(0)} = p = 1 - q, \lambda_u, \lambda_d\}$. The geometric stability of the exponential distribution implies that the monotone rises and falls of $X_k$ are exponentially distributed with parameters $q\lambda_u$ and $p\lambda_d$, respectively. Theorem 6 implies that $\textsc{shape}(\textsc{level}(X_k))$ is distributed as a binary Galton-Watson tree, $p_0 + p_2 = 1$, with

$$p_2 \equiv p_2^{(0)} = \frac{p\,\lambda_d}{q\,\lambda_u + p\,\lambda_d}. \qquad (24)$$



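The first pruning $X_k^{(1)}$ of $X_k$, according to (13), is the EHMC with parameters

$$\left\{ p^{(1)} = \frac{p\,\lambda_d}{q\,\lambda_u + p\,\lambda_d},\ q\,\lambda_u,\ p\,\lambda_d \right\}.$$

Its upward and downward monotone increments are exponentially distributed with parameters, respectively,

$$\frac{(q\,\lambda_u)^2}{q\,\lambda_u + p\,\lambda_d} \quad \text{and} \quad \frac{(p\,\lambda_d)^2}{q\,\lambda_u + p\,\lambda_d}.$$

By Theorem 6 the level-set tree for an arbitrary positive excursion of $X_k^{(1)}$ is a binary Galton-Watson tree, $p_0^{(1)} + p_2^{(1)} = 1$, with

$$p_2^{(1)} = \frac{(p\,\lambda_d)^2}{(q\,\lambda_u)^2 + (p\,\lambda_d)^2}.$$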

Continuing this way, we find that the $n$-th pruning $X_k^{(n)}$ of $X_k \equiv X_k^{(0)}$ is an EHMC such that the level set tree of its arbitrary positive excursion has a binary Galton-Watson distribution, $p_0^{(n)} + p_2^{(n)} = 1$, with

$$p_2^{(n)} = \frac{(p\,\lambda_d)^{2^n}}{(q\,\lambda_u)^{2^n} + (p\,\lambda_d)^{2^n}}.$$

This can be rewritten in recursive form as

$$p_2^{(n)} = \frac{\left[p_2^{(n-1)}\right]^2}{\left[p_0^{(n-1)}\right]^2 + \left[p_2^{(n-1)}\right]^2}, \quad n \ge 1,$$

with $p_2^{(0)}$ given by (24). Notably, this is the same recursive system as that discovered by Burd et al. [5, Proposition 2.1] (see Theorem 3 above) in their analysis of consecutive pruning for the Galton-Watson trees. Another noteworthy relation is given by

$$p^{(n)} = p_2^{(n-1)}, \quad n \ge 1, \quad p^{(0)} = p, \quad p_2^{(0)} = p_2,$$

which connects the "horizontal" probability $p^{(n)}$ of an upward jump in a pruned time series with the "vertical" probability $p_2^{(n)}$ of branching in a Galton-Watson tree.
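The three relations of this subsection can be verified against each other numerically. The sketch assumes the closed form $p_2^{(n)} = (p\lambda_d)^{2^n} / \left((q\lambda_u)^{2^n} + (p\lambda_d)^{2^n}\right)$, the Burd et al. recursion of Theorem 3, and the EHMC pruning map of Lemma 1(c); the function names and the parameter values are illustrative only.

```python
from typing import List, Tuple


def p2_closed_form(p: float, lam_u: float, lam_d: float, n: int) -> float:
    """Branching probability p_2^(n) after n prunings (assumed closed form)."""
    up, down = (1.0 - p) * lam_u, p * lam_d
    k = 2 ** n
    return down ** k / (up ** k + down ** k)


def p2_step(p2: float) -> float:
    """One step of the Galton-Watson pruning recursion of Theorem 3."""
    p0 = 1.0 - p2
    return p2 * p2 / (p0 * p0 + p2 * p2)


def prune_ehmc(params: Tuple[float, float, float]) -> Tuple[float, float, float]:
    """EHMC parameters (p, lambda_u, lambda_d) of the chain of local minima."""
    p, lam_u, lam_d = params
    up, down = (1.0 - p) * lam_u, p * lam_d
    return down / (down + up), up, down


start = (0.4, 1.0, 1.2)
depth = 5

closed: List[float] = [p2_closed_form(*start, n) for n in range(depth + 1)]
recursive: List[float] = [closed[0]]
jump_probs: List[float] = [start[0]]
params = start
for n in range(1, depth + 1):
    recursive.append(p2_step(recursive[-1]))   # "vertical" recursion
    params = prune_ehmc(params)                # "horizontal" pruning
    jump_probs.append(params[0])

for n in range(1, depth + 1):
    print(n, closed[n], recursive[n], jump_probs[n], closed[n - 1])
```

In each printed row the second and third numbers should agree (closed form versus recursion), as should the last two (upward-jump probability $p^{(n)}$ versus $p_2^{(n-1)}$).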



7. Terminology and proofs 

7.1. Level-set trees: Definitions and terminology

This section introduces terminology for discussing the hierarchical structure of the local extrema of a finite time series $X_k$ and relating it to the level set tree $\textsc{level}(X)$. For consistency we repeat some terms introduced above to formulate Theorem 4.

Let $\{M_t\} \equiv \{M_t^{(1)}\}$, $t \in \mathcal{T}_1 \subset \mathbb{R}$, be the set of local minima of $X_t$, not including possible boundary minima; $\{M_t^{(2)}\}$, $t \in \mathcal{T}_2 \subset \mathbb{R}$, be the set of local minima of local minima (local minima of second order), etc., with $\{M_t^{(j)}\}$, $t \in \mathcal{T}_j \subset \mathbb{R}$, being the local minima of order $j$. Next, let $\{m_s\} \equiv \{m_s^{(1)}\}$, $s \in \mathcal{S}_1 \subset \mathbb{R}$, be the set of local maxima of $X_t$, including possible boundary maxima, and $\{m_s^{(j+1)}\}$, $s \in \mathcal{S}_{j+1} \subset \mathbb{R}$, the set of local maxima of $\{M_t^{(j)}\}$ for all $j \ge 1$. We will call a segment between two consecutive points from $\mathcal{T}_j$ a (complete) basin of order $j$. Clearly, $\mathcal{T}_1 \supset \mathcal{T}_2 \supset \cdots$ and each basin of order $r$ is comprised of a non-zero number of basins of arbitrary order $k < r$. For each $r$, there might exist a single leftmost and a single rightmost segment of $X_t$ that do not belong to any basin of order $r$, with a possibility for them to merge if $X_t$ does not have basins of order $r$ at all. We call those segments incomplete basins of order $r$.

By construction, each basin of order $j$ contains exactly one point from $\mathcal{S}_j$; e.g., there is a single local maximum from $\mathcal{S}_1$ between two consecutive local minima from $\mathcal{T}_1$, etc. There exists a bijection between basins (complete and incomplete) of order $r$ in $X_t$ and branches of Horton-Strahler order $r$ in $\textsc{level}(X_t)$; this explains the terms complete branch and incomplete branch of order $r$. More specifically, there is a bijection between the terminal vertices of order-$r$ branches (i.e., vertices parental to two branches of order $(r - 1)$) and the local maxima from $\mathcal{S}_r$ within the respective basins.

Let us fix an arbitrary local minimum $X_k$ of order $r_k$; then $k \in \mathcal{T}_j$ for $1 \le j \le r_k$ and $k \notin \mathcal{T}_j$ for $j > r_k$. For each $j > r_k$ there exists a unique basin of order $j$ that contains $k$; we denote the boundaries of this basin by $l_k^{(j)}, r_k^{(j)} \in \mathcal{T}_j$, $l_k^{(j)} < r_k^{(j)}$. Denote by $c_k^{(j)}$ the unique point from $\mathcal{S}_j$ within the interval $(l_k^{(j)}, r_k^{(j)})$. Multiple points $X_k$ may correspond to the same triplet $(l_k^{(j)}, c_k^{(j)}, r_k^{(j)})$, which will create no confusion. These definitions are illustrated in Fig. 9.

Consider now a point $k$ of local minimum such that $k \notin \bigcup_{j > 1} \mathcal{S}_j$. If $l_k^{(j)} < k < c_k^{(j)}$ for a given $j > r_k$, then we call the point $l_k^{(j)}$ the local minimum of order $j$ adjacent to $k$ and the point $r_k^{(j)}$ the local minimum of order $j$ opposite to $k$. The analogous terminology is introduced in case $c_k^{(j)} < k < r_k^{(j)}$. By construction, $X_k$ is always greater than the value of its adjacent minimum of any order $j > r_k$. The value of the opposite minimum of order $j$ is denoted by $M_k^{(j)}$. We have, for each $k$,

$$M_k^{(1)} \ge M_k^{(2)} \ge M_k^{(3)} \ge \dots \qquad (25)$$

We already noticed that the local maxima $m_s^{(1)}$ correspond to the tree leaves, that is to its branches of Horton order $r = 1$. The set $\{m_s^{(j+1)}\}$ for each $j \ge 1$ corresponds to the vertices parental to two branches of the same HS order $j$; they are the terminal vertices of order-$(j + 1)$ branches. All other local minima of $X_k$ correspond to vertices parental to two vertices of different HS order; we will refer to this as side-branching. Specifically, a local minimum $X_k$ of order $i$ forms a side-branch of order $\{ij\}$ if

$$M_k^{(j-1)} > X_k > M_k^{(j)}, \qquad (26)$$

where the first inequality disappears when $j = i + 1$. Figure 10 illustrates this for a basin of second order. In general, each basin of order $r$ contains a uniquely specified positive excursion attached to its higher end. The local maxima of order $k < r$ from this excursion correspond to the side-branches with Tokunaga index $\{km\}$ with $m \le r$. The local maxima of order $k < r$ within the basin but outside of this excursion correspond to the side-branches with Tokunaga index $\{km\}$ with $m > r$.

7.2. Proofs 

Proof of Propositions 1, 2, 3 and 4. The statements readily follow from the definition of level set trees. □

Proof of Lemma 1.

(a) Follows from the independence of increments in $X_k$.

(b) Let $\{M_j\}$ be the sequence of local minima of $X_k$ and $d_j = M_{j+1} - M_j$. We have, for each $j$,

$$d_j = \sum_{i=1}^{\xi_+} Y_i - \sum_{i=1}^{\xi_-} Z_i, \qquad (27)$$

where $\xi_+$ and $\xi_-$ are independent geometric random variables with parameter $1/2$:

$$P(\xi_+ = k) = P(\xi_- = k) = 2^{-k}, \quad k = 1, 2, \dots,$$

and $Y_i$, $Z_i$ are independent identically distributed (i.i.d.) random variables with density $f(x)$. Here the first sum corresponds to $\xi_+$ positive increments of $X_k$ between a local minimum $M_j$ and the subsequent local maximum $m_j$, and the second sum to $\xi_-$ negative increments between the local maximum $m_j$ and the subsequent local minimum $M_{j+1}$. It is readily seen that both sums in (27) have the same distribution, and hence their difference has a symmetric distribution. We notice that the symmetric kernel for the sequence of local minima $\{M_j\}$ is necessarily different from $K(x)$.

(c) Consider an EHMC $X_k$ with parameters $\{p, \lambda_u, \lambda_d\}$. By statement (a) of this lemma, the local minima of $X_k$ form a HMC with transition kernel $K_1(x)$. The latter is the probability distribution of the jumps $d_j$ given by (27) with $\xi_+$, $\xi_-$ being geometric random variables with parameters $p$ and $(1 - p)$, respectively, $Y_i \sim \phi_{\lambda_u}$, and $Z_i \sim \phi_{\lambda_d}$. For the characteristic function of $K_1$ one readily has

$$\hat K_1(s) = \frac{p\,(1 - p)\,\lambda_d\,\lambda_u}{\left((1 - p)\,\lambda_u - is\right)\left(p\,\lambda_d + is\right)} = p^*\,\hat\phi_{\lambda_u^*}(s) + (1 - p^*)\,\hat\phi_{\lambda_d^*}(-s)$$

with

$$p^* = \frac{p\,\lambda_d}{p\,\lambda_d + (1 - p)\,\lambda_u}, \quad \lambda_d^* = p\,\lambda_d, \quad \text{and} \quad \lambda_u^* = (1 - p)\,\lambda_u.$$

Thus

$$K_1(x) = p^*\,\phi_{\lambda_u^*}(x) + (1 - p^*)\,\phi_{\lambda_d^*}(-x).$$

This means that the HMC of local minima also jumps according to a two-sided exponential law, only with different parameters $p^*$, $\lambda_d^*$ and $\lambda_u^*$. □

Proof of Theorem 4: Horton self-similarity.

We notice that the number $N_r$ of order-$r$ branches in $\textsc{level}(X)$ equals the number $|\mathcal{S}_r|$ of local maxima $m_s^{(r)}$ of order $r$ (with the convention that the local minima of order 0 are the values of $X_k$). The probability for a given point of $X_k$ to be a local maximum equals the probability that this point is higher than both its neighbors. The Markov property and symmetry of the chain imply that this probability is $1/4$. Hence the average number of local maxima is

$$E(|\mathcal{S}_1|) = E(N_1) = \sum_{i=2}^{N-1} P(X_{i-1} < X_i > X_{i+1}) = \frac{N - 2}{4} \sim \frac{N}{4}.$$

Let $l_i$ denote the event ($X_i$ is a local maximum). By the Markov property, the events $l_i$, $l_j$ are independent for $|i - j| \ge 2$; hence, the variance $V(N_1) \propto N$. This yields

$$\lim_{N \to \infty} E\left(\frac{N_1}{N}\right) = 1/4, \quad \lim_{N \to \infty} V\left(\frac{N_1}{N}\right) = 0.$$

One can combine the strong laws of large numbers for (i) the proportion of the upward increments of $X_t$ (that converges to $1/2$) and (ii) the proportion of upward increments followed by a downward increment (that converges to $1/2$) to obtain $N_1/N \xrightarrow{\text{a.s.}} 1/4$, and, in particular, $N_1 \xrightarrow{\text{a.s.}} \infty$ as $N \to \infty$.

We use now Lemma 1(b) to find, applying the same argument to the pruned time series, that $N_r/N_{r-1} \xrightarrow{\text{a.s.}} 1/4$ as $N \to \infty$ for any $r > 1$. Finally,

$$\frac{N_r}{N} = \frac{N_r}{N_{r-1}} \frac{N_{r-1}}{N_{r-2}} \cdots \frac{N_1}{N} \xrightarrow{\text{a.s.}} 4^{-r}, \quad N \to \infty,$$

which completes the proof of the strong Horton law (14). □
The proof of the Tokunaga self-similarity will require several auxiliary statements formulated below.

Lemma 2. A basin of order $j$ contains on average $4^{j-k}$ basins of order $k$, for any $j > k \ge 1$.

Proof of Lemma 2. We show first that a basin of order $(j + 1)$ contains on average 4 basins of order $j \ge 1$. The number $\xi$ of points of $X_k$ within a first-order basin (i.e., between two consecutive local minima) is $\xi = 1 + \xi_+ + \xi_-$, where $\xi_+$, $\xi_-$ are, respectively, the numbers of basin points (excluding the basin boundaries) to the left and right of its local maximum $m$; the latter is counted separately in the expression above. The independence of increments of $X_k$ implies

$$P(\xi_+ = k) = P(\xi_- = k) = 2^{-k-1}, \quad k = 0, 1, \dots,$$

and hence

$$E[\xi] = 1 + E[\xi_+] + E[\xi_-] = 1 + 1 + 1 = 3. \qquad (28)$$

By Lemma 1(b), the same result holds for the average number of local minima of order $j$ within an order-$(j + 1)$ basin, for any $j \ge 1$. Thus, the average number of order-$j$ basins within an order-$(j + 1)$ basin is $E[\xi] + 1 = 4$.

The independence of increments of $X_k$ implies that the number of order-$(j - 1)$ subbasins within an order-$j$ basin is independent of the number of order-$j$ basins within an order-$(j + 1)$ basin. This leads to the Lemma's statement. □

Lemma 3. Let $a$ and $b$ be two points chosen at random and without replacement from the set $\{1, 2, \dots, N\}$, and let $\eta = (\eta_1, \eta_2, \eta_3)$ denote the random numbers of points within the following intervals, respectively: (i) $[1, \min(a, b))$, (ii) $(\min(a, b), \max(a, b))$, and (iii) $(\max(a, b), N]$. Then the triplet $\eta$ has an exchangeable distribution.

Proof of Lemma 3. We notice that the triplet $\eta$ can be equivalently constructed by choosing three points $(a, b, c)$ at random from $(N + 1)$ points on a circle and counting the number of points within each of the three resulting segments. This implies exchangeability. □
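For small $N$ the exchangeability claim of Lemma 3 can be confirmed by exhaustive enumeration. The sketch below is only an illustrative check with hypothetical function names: it lists every ordered pair $(a, b)$, records the three gap sizes, and tests that the resulting distribution is invariant under all permutations of the triplet.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from typing import Dict, Tuple

Triple = Tuple[int, int, int]


def gap_distribution(n: int) -> Dict[Triple, Fraction]:
    """Exact law of (eta_1, eta_2, eta_3) for two distinct points from {1..n}."""
    counts: Counter = Counter()
    for a in range(1, n + 1):
        for b in range(1, n + 1):
            if a == b:
                continue
            lo, hi = min(a, b), max(a, b)
            # points in [1, lo), in (lo, hi), and in (hi, n]
            counts[(lo - 1, hi - lo - 1, n - hi)] += 1
    total = n * (n - 1)
    return {triple: Fraction(c, total) for triple, c in counts.items()}


def is_exchangeable(dist: Dict[Triple, Fraction]) -> bool:
    """True if every permutation of every triple has the same probability."""
    return all(
        dist.get(tuple(perm), Fraction(0)) == prob
        for triple, prob in dist.items()
        for perm in permutations(triple)
    )


dist = gap_distribution(7)
print(len(dist), is_exchangeable(dist))
```

The enumeration also shows why the lemma holds: every triplet of non-negative gap sizes summing to $N - 2$ arises from exactly one unordered pair, so the distribution is uniform over such triplets.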

Lemma 4. Let $Y_i \in \mathbb{R}$, $i = 1, 2, \dots$ be i.i.d. random variables, let a pair $(n, m) \in \mathbb{N}^2$ have an exchangeable distribution independent of $Y_i$, and let

$$X = \sum_{i=1}^{n} Y_i - \sum_{i=n+1}^{n+m} Y_i.$$

Then $X$ has a symmetric distribution.

Proof of Lemma 4. Let $\Delta = n - m$ and let $F(X \mid \Delta)$ denote the conditional distribution of $X$ given $\Delta$. From the definition of $X$ it follows that

$$F(X \mid \Delta = k) = F(-X \mid \Delta = -k).$$

Exchangeability of $(n, m)$ implies symmetry of $\Delta$ and we thus obtain

$$F(X) = \sum_{k=-\infty}^{\infty} F(X \mid \Delta = k)\,P(\Delta = k) = \sum_{k=0}^{\infty} \left[F(X \mid \Delta = k) + F(X \mid \Delta = -k)\right] P(\Delta = k) = \sum_{k=0}^{\infty} \left[F(X \mid \Delta = k) + F(-X \mid \Delta = k)\right] P(\Delta = k).$$

The sums of conditional distributions in brackets are symmetric, which completes the proof. □



Proof of Theorem 4: Tokunaga self-similarity.

We will show that $\lim_{N \to \infty} T_{ij} = 2^{j-i-1}$ for any pair $j > i$. By Lemma 1(b), $T_{ij} = T_{(i+m)(j+m)}$, and so it suffices to prove the statement for $i = 1$, that is to show that $\lim_{N \to \infty} T_{1j} = 2^{j-2}$ for any $j \ge 2$. This will be done by induction. Below we use the terminology introduced in Sect. 7.1.

Induction base, $j = 2$. Consider a basin of order 2, formed by two consecutive points from $\mathcal{T}_2$ (local minima of second order). We denote here their positions by $L$ and $R$, $L < R$. This part of the proof will consider only local minima from this interval; they will be referred to as "points".

The highest local minimum, or point $c^{(2)} \in \mathcal{S}_2$, forms a vertex parental to two branches of order 1 with Tokunaga indices $\{11\}$; in addition, a random number of local minima corresponds to internal vertices parental to side-branches with Tokunaga indices $\{1j\}$, $j > 1$. The number $N_{12}^{(L,R)}$ of vertices of index $\{12\}$ within $(L, R)$ equals the number of side-branch points $X_k$ that are higher than their opposite minimum of second order:

$$N_{12}^{(L,R)} = \#\left\{L < k < R : X_k > M_k^{(2)}\right\}.$$

For each side-branch vertex $X_k$ we necessarily have $X_k < X_c$ since $X_c$ is maximal among the local minima. Recall that the local minima form a SHMC. Hence, for a randomly chosen side-branch $X_k$ we have

$$X_c - X_k \overset{d}{=} \sum_{i=1}^{\xi'} Y_i,$$

where $\xi'$ is a geometric random variable such that $P(\xi' = k) = 2^{-k}$, and $Y_i > 0$ are i.i.d. random variables that correspond to the jumps between the local minima. Clearly, the difference $X_c - M_k^{(2)}$ has the same distribution. The random variables $(X_c - M_k^{(2)})$ and $(X_c - X_k)$ are independent and so $P\left[X_k > M_k^{(2)}\right] = 1/2$. The expected number of side-branches with index $\{12\}$ within the interval $(L, R)$ is

$$E\left[N_{12}^{(L,R)}\right] = E\left[\sum_{k=1}^{\xi - 1} 1_{(0,\infty)}\left(X_k - M_k^{(2)}\right)\right]. \qquad (30)$$

The summation above is taken over $(\xi - 1)$ side-branch points within $(L, R)$; the random variable $\xi$ was described in Lemma 2.

We show next that the random variables $1_{(0,\infty)}\left(X_k - M_k^{(2)}\right)$ are independent of $\xi$. Suppose that there exist $\xi = N$ points within $(L, R)$. A particular placement of $k$ and $c$ among these points is obtained by choosing two points at random and without replacement from $\{1, \dots, N\}$. By Lemma 3 the conditional distribution of the numbers of points between $k$ and $c$ and between $c$ and the local minimum opposite to $X_k$ is exchangeable. Lemma 4 implies that $P\left(X_k > M_k^{(2)} \mid \xi = N\right) = 1/2$. Thus,

$$E\left[N_{12}^{(L,R)}\right] = E[\xi - 1]\,P\left(X_k > M_k^{(2)}\right) = 2 \times 1/2 = 1. \qquad (31)$$

The numbers $N_{12}^{(L,R)}$ are independent for different basins of order 2 by the Markov property of $X_t$. The strong law of large numbers yields

$$T_{12} = \frac{N_{12}}{N_2} \xrightarrow{\text{a.s.}} 1 = 2^0 \quad \text{as } N \to \infty.$$

Induction step. Suppose that the statement is proven for $j \ge 2$, that is we know that for a randomly chosen local minimum $X_k$

$$P\left(X_k > M_k^{(j)}\right) = 2^{-(j-1)}$$

and $T_{1j} \xrightarrow{\text{a.s.}} 2^{j-2}$ as $N \to \infty$. We will prove it now for $(j + 1)$. Consider a randomly chosen side-branch point $X_k$ of order $\{1i\}$, $i > j$. By (26), $X_k < M_k^{(m)}$ for $1 < m \le j$ and thus necessarily $X_k < X_{c_k^{(i)}}$, $1 < i \le j + 1$, since $c_k^{(j+1)}$ is a local maximum of order-$j$ minima within the basin $(L, R)$ of order $(j + 1)$ that contains $k$. Repeating the argument of the induction base we find that $X_k - M_k^{(i)}$ has a symmetric distribution for all $i \le j + 1$ and that the probability of $\left(X_k > M_k^{(j+1)}\right)$ is independent of the number of local maxima of order $j$ within the basin $(L, R)$. This gives, for a randomly chosen $X_k$,

$$P\left[X_k > M_k^{(j+1)}\right] = P\left[X_k > M_k^{(j+1)},\ X_k > M_k^{(j)}\right] = P\left(X_k > M_k^{(j+1)} \,\middle|\, X_k > M_k^{(j)}\right) P\left(X_k > M_k^{(j)}\right) = 2^{-1} \times 2^{-(j-1)} = 2^{-j}.$$

By Lemma 2 the average number of order-2 basins within a basin of order $(j + 1)$ is $4^{j-1}$. Each such basin contains on average 2 points that correspond to side branches with Tokunaga index $\{1i\}$. Hence, the average total number of side-branches with index $\{1i\}$ within a basin of order $(j + 1)$ is $2 \times 4^{j-1} = 2^{2j-1}$. Applying Wald's lemma to the sum of indicators $1_{(0,\infty)}\left(X_k - M_k^{(j+1)}\right)$ over the random number of local minima of order $j$ within the basin $(L, R)$, we find the average total number of side-branches of order $\{1(j + 1)\}$:

$$E\left[N_{1(j+1)}^{(L,R)}\right] = 2^{-j} \times 2^{2j-1} = 2^{j-1}.$$

The strong law of large numbers yields

$$T_{1(j+1)} = \frac{N_{1(j+1)}}{N_{j+1}} \xrightarrow{\text{a.s.}} 2^{j-1} \quad \text{as } N \to \infty.$$



□

Proof of Proposition 5. Each transition step between the local minima of $X_k$ can be represented as $d_j$ of (27), where $\{Y_i\}$ and $\{Z_i\}$ are independent random variables with density $f(x)$, and $\xi_+$ and $\xi_-$ are two independent geometric random variables with parameter $1/2$. Wald's lemma readily implies that $c = 2$. This gives for the characteristic functions

$$\hat K_1(s) = \hat K(2s) = \Re\left[\hat f(2s)\right].$$

On the other hand, taking the characteristic function of $d_j$ we obtain

$$\hat K_1(s) = \left|\frac{\hat f(s)}{2 - \hat f(s)}\right|^2,$$

which completes the proof. □



Proof of Theorem 5. The Tokunaga and Horton self-similarity for a symmetric EHMC was proven in Theorem 4. Here we show the violation of the Horton self-similarity for an asymmetric EHMC.

Let $X_k^{(m)}$ denote the time series obtained by $m$-time repetitive pruning of the time series $X_k$. Recall that there is a one-to-one correspondence between the local maxima of $X_k^{(m-1)}$ and the branches of order $m$ in the level set tree $\textsc{level}(X)$ (see Sect. 7.1). Hence, the Horton self-similarity is equivalent to the invariance of the proportion of local maxima with respect to pruning. The proportion of local maxima in $X_k^{(m)}$ equals the probability for a randomly chosen point to be a local maximum. The Markov property of $X_k^{(m)}$ (Lemma 1(c)) implies that

$$P^{(m)} = p^{(m)}\left(1 - p^{(m)}\right),$$

where $p^{(m)}$ is the probability of an upward jump in $X_k^{(m)}$.

For an asymmetric EHMC let $A^{(m)}$ be the $m$-th iteration of $A$, as in (22), (23). Then, for $m \ge 1$, either $A^{(m)} < 1$, in which case $A^{(m)} \to 0$, or $A^{(m)} > 1$, in which case $A^{(m)} \to \infty$, all as $m \to \infty$ (see Sect. 6.2, Eq. (23) and Fig. 11). This corresponds to $p^{(m)} = 1/(A^{(m)} + 1) \to 1$ or $p^{(m)} \to 0$, respectively, and leads to $P^{(m)} \to 0$. This prohibits the Horton, and hence Tokunaga, self-similarity. □



8. Discussion 

This work establishes the Tokunaga and Horton self-similarity for the level-set tree of a finite symmetric homogeneous Markov process with discrete time and continuous state space (Sect. 5, Theorem 4). We also suggest a definition of self-similarity for an infinite tree, using the construction of a forest of subtrees attached to the floor line [25]; this allows us to establish the Tokunaga and Horton self-similarity for a regular Brownian motion (Sect. 5, Corollary 1). This particular extension to infinite trees seems natural for tree representation of time series, where concatenation of individual finite time series corresponds to the "horizontal" growth of the corresponding tree. Alternative definitions might be better suited, though, for other situations related, say, to the "vertical" growth of a tree from the leaves, as in a branching process.

A useful observation is the equivalence of smoothing the time series by removing its local maxima and pruning the corresponding level-set tree (Sect. 5, Proposition 4). It allows one to switch naturally between the tree and time-series domains in studying various self-similarity properties.

As discussed in the introduction, the Tokunaga self-similarity for various finite-tree representations of a Brownian motion follows from (i) the results of Burd et al. [5] on the
Tokunaga self-similarity for the critical binary Galton-Watson process and (ii) the equivalence
of a particular tree representation to this process. We suggest here an alternative, direct approach to establishing Tokunaga self-similarity in Markov processes. Not only does this approach
avoid any reference to the Galton-Watson property, it also extends the Tokunaga self-similarity to a
much broader class of trees. Indeed, as shown by Le Gall [32] and Neveu and Pitman [2E] 
(see Theorem 6), the tree representation of any non-exponential symmetric Markov chain is
not Galton-Watson; it is still Tokunaga, however, by our Theorem 4.

Peckham and Gupta [16] have introduced the generalized Horton laws, which state the equality in distribution for the rescaled versions of suitable branch statistics $S_r$: $S_r \overset{d}{=} R_S^{r-k} S_k$, $R_S > 0$. These authors established the existence of the generalized Horton laws in Shreve's random model, that is for the Galton-Watson trees. Accordingly, one would expect the generalized Horton laws to hold for the exponential symmetric Markov chains.
Veitzer and Gupta [UJ and Troutman [SB] have studied the random self-similar network 
(RSN) model introduced in order to explain the variability of the limiting branching ratios in 
the empirical Horton laws. They have demonstrated that the extended Horton laws hold for 
various branch statistics, including the average magnitudes M r , in this model. Furthermore, 
they established the weak Horton laws Q, ^ and Tokunaga self-similarity for the RSN 
model. Notably, the RSN model does not belong to the class of Galton- Watson trees, yet 
it demonstrates the Tokunaga self-similarity, similarly to the non-exponential symmetric 
Markov chains considered here. 

Tree representation of stochastic processes [221 ESI [29], [301 EB E21 [33] and real functions 
[36], [37] is an intriguing topic that attracts the attention of mathematicians and natural scientists. 
A structurally simple yet flexible Tokunaga self-similarity, which extends beyond the classical 
Galton-Watson space, may provide useful insight into the structure of existing data sets 
and models, as well as suggest novel ways of modeling various natural phenomena. For 
instance, the level set tree representation has been used recently in the analysis of the statistical 
properties of fragment coverage in genome sequencing experiments [39], [40], [41]. Some 
of the methods and results obtained in this work might prove useful for such gene 
studies. In particular, it would be interesting to test the self-similarity of gene-related trees 
and to interpret it in a biological context. 

Notably, the results of this paper, as well as those of Burd et al. [5], refer only to a single 
point $(a, c) = (1, 2)$ in the two-dimensional space of Tokunaga parameters. Empirical 
and numerical studies, however, report a broad range of these parameters, roughly $1 < a < 2$ 
and $1 < c < 4$. This motivates a search for more general Tokunaga models; a potentially broad 
family is suggested by our Conjecture 1. 
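
The role of the point $(a, c) = (1, 2)$ can be illustrated numerically. The following is a minimal sketch, assuming the standard Tokunaga parameterization $T_k = a\,c^{k-1}$ and the usual branch-counting recursion $N_i = 2N_{i+1} + \sum_{j>i} T_{j-i} N_j$ for a tree of a given order with a single branch of the highest order. The function names are hypothetical and introduced here for illustration only. Under these assumptions the Horton exponent $R$ is the largest root of $R^2 - (a + c + 2)R + 2c = 0$, which gives $R = 4$ at $(a, c) = (1, 2)$.

```python
import math


def tokunaga_coefficients(a, c, max_k):
    """Return [T_1, ..., T_max_k] with T_k = a * c**(k - 1) (assumed parameterization)."""
    return [a * c ** (k - 1) for k in range(1, max_k + 1)]


def horton_exponent(a, c):
    """Largest root R of R**2 - (a + c + 2) * R + 2 * c = 0.

    The equation follows from substituting N_i ~ R**(-i) into the
    branch-counting recursion and summing the geometric series in c / R.
    """
    s = a + c + 2
    return (s + math.sqrt(s * s - 8 * c)) / 2


def branch_counts(a, c, order):
    """Return [N_1, ..., N_order] for a tree of the given order, with N_order = 1.

    Each branch of order i + 1 is formed by two branches of order i, and each
    branch of order j > i carries T_{j - i} side branches of order i.
    """
    t = tokunaga_coefficients(a, c, order)  # t[k - 1] holds T_k
    counts = [0] * order
    counts[order - 1] = 1
    for i in range(order - 1, 0, -1):  # i is the 1-based branch order
        side = sum(t[j - i - 1] * counts[j - 1] for j in range(i + 1, order + 1))
        counts[i - 1] = 2 * counts[i] + side
    return counts


if __name__ == "__main__":
    a, c = 1, 2  # the Tokunaga point discussed in the text
    counts = branch_counts(a, c, 8)
    ratios = [counts[k] / counts[k + 1] for k in range(len(counts) - 1)]
    print("Horton exponent:", horton_exponent(a, c))
    print("branch counts N_1..N_8:", counts)
    print("ratios N_k / N_{k+1}:", ratios)
```

For $(a, c) = (1, 2)$ and a tree of order 3 the recursion gives the counts 11, 3, 1, and the ratios $N_k / N_{k+1}$ approach 4 as the order grows. Other values of $(a, c)$ in the empirically reported range can be explored by changing the two parameters.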

The construction of the level set tree is a particular case of the coagulation process; in 
the real function context it describes the hierarchical structure of the embedded excursions 
of increasing lengths and heights. Coagulation theory, a well-established field with a broad 
range of practical applications to physics, biology, and social sciences [121 SSI H], is heavily 
based on the concepts of symmetry and exchangeability [212 H2]. We find it noteworthy 
that the only property used to establish the results in this paper is the symmetry of a Markov 
chain. It seems worthwhile to explore the concept of Tokunaga self-similarity for a general 
coalescent process. 

Figure 1: Representation of a tree via a set of finite sequences $(i_1, \dots, i_n)$. 

Figure 2: (a) Tree $T$ and its depth-first search illustrated by dashed arrows; (b) Harris path for the tree $T$ 
of panel (a). 

Figure 3: Function $X_t$ (panel a) with a finite number of local extrema and its level-set tree LEVEL(X) (panel 
b). 

Figure 4: Illustration of the pseudo-distance $d_X(a,b)$ used to define the tree TREE(X) for a continuous 
function $X_t$. This example refers to a Harris path $X_t$ with a finite number of extrema, so one can 
construct a level set tree for $X_t$. Here, the local maxima $X(a)$ and $X(b)$ correspond to the leaves a a and a b 
in the tree shown on the left. The distance between these points is measured along the shortest path from 
a a to a b along the tree (marked by heavy lines), or equivalently, by Eq. ([!]). 



Figure 5: Example of (a) Horton-Strahler ordering, and of (b) Tokunaga indexing. Two order-2 branches 
are depicted by heavy lines in both panels. The Horton-Strahler orders refer, interchangeably, to the tree 
nodes or to their parent links. The Tokunaga indices refer to entire branches, and not to individual links. 




Figure 6: Example of consecutive application of the pruning operation $\mathcal{R}(\cdot)$ to the tree $T$; the panels show $T$, $\mathcal{R}(T)$, and $\mathcal{R}(\mathcal{R}(T))$. In this example 
the tree has order $\Omega = 3$, so $\mathcal{R}^{3}(T) = \phi$. For visual convenience the pruned branches are shown in all panels 
in a light color. Notice that pruning may produce chains of single-child nodes. 

Figure 7: Illustration of tree construction for an infinite time series. The time series $X_t$ is divided here into 
two vertically shifted excursions, marked A and B on the time axis, and one fall, depicted by the heavy 
segment on the time axis. The descending ladder $L_X$ consists of two isolated points and one interval (heavy 
segment on the time axis). The excursions correspond to the two trees represented by marked triangles; the 
interval from the descending ladder corresponds to the line that connects the trees A and B. 

Figure 9: Basin of order 2: an illustration. The figure shows a basin of order 2 that consists of 5 local 
minima. The figure illustrates the taxonomy used in the paper; it shows the local maximum m k of the 
basin's local minima, the opposite and adjacent minima of second order for a local minimum Xk, as well as 
the corresponding points l k 2 ^ , cjj, 2 ' , and ■ 



Figure 10: Tokunaga indexing: an illustration. The figure shows the Tokunaga indexing for the local minima 
of the second order basin shown in Fig. 9. The values of $k > 2$ are determined by the large-scale structure 
of the function $X_t$. 

Figure 11: Characterization of EHMCs in the space $(\Lambda, \gamma)$ of (22) with iteration rules (23) that correspond 
to the transition to the EHMC of local maxima. Each EHMC corresponds to a point on the plane $(\Lambda, \gamma)$. 
The chain of local minima for any EHMC corresponds to a point on the hyperbola $\Lambda\gamma = 1$. The point 
$(\Lambda = 1, \gamma = 1)$ is fixed. Any point from the lower branch $(\Lambda > 1, \gamma < 1)$ moves along the hyperbola toward 
$(\infty, 0)$. Any point from the upper branch $(\Lambda < 1, \gamma > 1)$ moves along the hyperbola toward $(0, \infty)$. Arrows 
illustrate the point dynamics. 